Introduction


Neurofibromatosis  type  2  (NF2),  characterized  by  tumors  of  the  brain  and  central  nervous 
system,  is  a  debilitating,  inherited  disorder  affecting  1  in  40,000  people  (Evans  et  al., 
2000).  Genetic  studies  revealed  that  the  disorder  is  caused  by  mutations  in  the  NF2  gene 
which  codes  for  a  protein  known  as  merlin  or  schwannomin  (Lutchman  and  Rouleau, 
1996).  Merlin  is  a  member  of  the  ERM  (Ezrin,  Radixin,  and  Moesin)  family  of  proteins, 
but  it  is  unique  in  that  it  acts  as  a  tumor  suppressor  (Tsukita  et  al.,  1997).  Merlin  exists  in 
two  different  splice  forms  with  different  C-termini,  has  no  catalytic  activity,  functions 
through  interactions  with  other  proteins  and  is  regulated  by  phosphorylation  (Rong  et  al., 
2004). The aim of our project was to provide insights into the atomic structure of merlin
and into the mechanisms of selected protein-protein interactions involving merlin,
specifically the interaction with the regulatory protein RhoGDI, which was thought to link
merlin to cytosolic GTPases such as RhoA. The study was subsequently extended to
include another partner of merlin, the scaffolding protein syntenin.


Figure  1.  A  schematic  representation  of  the 
closed  and  open  conformations  of  merlin, 
and  its  interaction  with  syntenin.  The 
diagram  also  lists  other  proteins  implicated 
in  interactions  with  either  merlin  or 
syntenin. 


Body 


1.  Merlin  -  the  product  of  the  causal  gene  of  NF2 

1.1.  Determination  of  the  N-terminal  (FERM)  domain  structure. 

During the first year of this project, we determined the structure of the N-terminal (FERM)
domain by X-ray crystallography (Kang et al., 2002). We expressed the FERM domain as an
N-terminal GST fusion protein in Escherichia coli and purified it by standard methods. The
structure was solved by molecular replacement using the radixin FERM domain (Hamada et
al., 2000) as the search model. The final model has an Rwork of 19.3% and an Rfree of 22.7%
and agrees well with standard protein geometry (other details can be found in the appended
publication). The FERM domain comprises three subdomains, referred to as A, B, and C.
Subdomain A is a mixed α/β domain that resembles ubiquitin, subdomain B is a primarily
helical domain with similarity to the acyl-CoA binding protein, and subdomain C is a mixed
α/β domain that is similar to signaling domains such as PTB, PH and EVH1. The FERM
domain of merlin is similar to the analogous domains of radixin and moesin (Pearson et
al., 2000), although a number of differences are conspicuous.

Although  the  most  severe  cases  of  NF2  are  caused  by  the  complete  or  partial 
deletion  of  merlin,  20  missense  mutations  that  cause  NF2  are  found  in  the  FERM  domain 
of  merlin.  The  ways  in  which  these  mutations  might  affect  the  structure  of  merlin,  and 
consequently  cause  NF2,  are  discussed  in  detail  in  the  appended  publication  (Kang  et  al., 
2002).  Overall,  most  of  the  mutations  can  be  grouped  into  two  categories:  those  that 
would  disrupt  the  hydrophobic  core  of  one  of  the  FERM  subdomains,  and  those  that 
would alter the 3D arrangement of the subdomains. The mutations that would disrupt the
hydrophobic  core  of  one  of  the  subdomains  most  likely  lead  to  an  improperly  folded 
protein  that  is  targeted  for  degradation  through  the  ubiquitin-proteasome  pathway 
(Gautreau  et  al.,  2002). 

1.2. The interactions of the FERM domain with the C-terminal domain within
merlin

The  N-  and  C-terminal  domains  of  merlin  can  interact  with  each  other,  leading  to  closed 
and  open  conformations,  with  functional  consequences.  This  head-to-tail  interaction  can 
be  intramolecular  or  intermolecular,  leading  to  'closed'  monomers  or  symmetric  dimers, 
and  similar  interactions  allow  merlin  to  associate  with  other  ERM  proteins  (Sherman  et 
al., 1997b). As outlined in our specific aims, we initiated studies of the interactions
of these domains using isothermal titration calorimetry (ITC), a sensitive biophysical
technique that had not previously been applied to any of the ERM proteins. The two isoforms
of merlin differ in that merlin 2 terminates at exon 16 (C16), whereas merlin 1 excludes exon
16 but contains exon 17 (C17) (Haase et al., 1994). Merlin 1 is the major and most
extensively researched isoform, and most of the literature suggests that it is the only
isoform capable of forming head-to-tail interactions.

To examine the interaction of the N- and C-termini of merlin, C16 and C17
oligopeptides were expressed as His-tagged proteins and purified using Ni-NTA and
size-exclusion columns. Before quantitating the interaction between these domains by ITC,
we used other biochemical techniques to verify that the expressed constructs were capable
of interacting. Indeed, pull-down experiments, in which either the FERM domain or C17 was
bound to its respective affinity column and the non-tagged form of the other protein was
passed through the column, showed that the N- and C-terminal domains interact specifically
with each other. Moreover, when C17 and the FERM domain were mixed prior to
size-exclusion chromatography, a peak corresponding to a FERM-C17 heterodimer was
observed. These initial experiments indicated that a more thorough examination of the
interaction between the FERM domain and the C-terminal domain was warranted.

In each ITC experiment, the FERM domain of merlin was placed in the cell of the
ITC instrument and the C-terminal domain was added in small aliquots while the heat
required to maintain the temperature of the system was monitored. All samples were
extensively dialyzed into the same buffer before the experiments to reduce systematic
errors. A dissociation constant (Kd) of 96 nM and a stoichiometry of 1:1 were obtained
for the interaction of the FERM domain and C17. Furthermore, our results indicate that
there is also a weak 1:1 interaction between the FERM domain and C16, but an accurate
dissociation constant could not be obtained. Representative data are presented
in Figure 2.
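
To put the measured dissociation constant on an energy scale, it can be converted to a standard binding free energy with ΔG = RT ln(Kd / 1 M). The following is a minimal sketch of that arithmetic only. The temperature of 298.15 K and the 1 M standard state are assumptions (the report does not state the experimental temperature), and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = R*T*ln(Kd / 1 M). The 1 M standard state and the 298.15 K
    default are assumptions, not values taken from the report.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)

# FERM domain with C17: Kd = 96 nM, as reported in the text.
dg_c17 = binding_free_energy(96e-9)
print(f"FERM-C17: dG = {dg_c17:.1f} kcal/mol at the assumed temperature")

# A tenfold change in Kd shifts dG by R*T*ln(10), whatever the absolute value.
tenfold = binding_free_energy(1e-6) - binding_free_energy(1e-5)
print(f"Tenfold tighter binding: ddG = {tenfold:.2f} kcal/mol")
```

The same conversion applies to any Kd quoted in this report, provided the value is first expressed in mol/L.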


Figure 2. Representative ITC data showing the association of the N- and C-terminal
domains of merlin. A: The FERM domain and C17 (type 1 merlin). B: The FERM domain and
C16 (type 2 merlin). The top panels show raw data, while the bottom panels show the total
heat associated with each injection plotted as a function of the molar ratio created in the
cell by the injection.

Our plan was to publish these data along with the crystal structure of the complex
of  the  N-  and  C-  terminal  domains  of  merlin,  or  with  the  structure  of  the  intact  protein. 
Unfortunately,  despite  numerous  efforts,  we  were  not  successful  in  obtaining 

1.3.  Interaction  of  Merlin  with  RhoGDI 

Data published by others strongly suggested that the FERM domain of merlin interacts
with the Rho-specific nucleotide dissociation inhibitor (RhoGDI) (Maeda et al., 1999a).
In fact, it had been suggested that the interaction between ERM proteins and RhoGDI
constitutes the basis for the activation of Rho GTPases by ERM proteins. Our previous
work on RhoGDI (Longenecker et al., 1999; Longenecker et al., 2001) was originally
one of the reasons for our interest in the structural and functional studies of merlin.
Unfortunately, contrary to earlier reports, we were unable to see any evidence of a direct
interaction of merlin's FERM domain with RhoGDI. When a mixture of RhoGDI and merlin
was subjected to size-exclusion chromatography, no peak corresponding to a complex was
observed. Pull-down experiments also failed to detect any interaction. ITC experiments
showed no heat associated with the mixing of these two proteins, and co-crystallization
trials using the FERM domain and RhoGDI failed to produce any crystals. We conclude that
the earlier reports of ERM-RhoGDI interactions were based on experimental artifacts. It is
noteworthy that, since this grant was awarded, no other reports of RhoGDI-merlin
interactions, or indeed of ERM protein-RhoGDI interactions, have appeared in the
literature.

2.  Structural  Studies  of  Syntenin 

With the consent of DOD and our reviewers, we shifted attention from RhoGDI to
syntenin, another partner of merlin (Jannatipour et al., 2001). This protein has also been
shown to bind a number of other proteins involved in cell proliferation and cellular
regulation, making its association with a tumor suppressor such as merlin especially
interesting. Syntenin is a cytosolic protein with two PDZ domains connected by a
four-residue linker. It was originally noted for its role in syndecan-mediated signaling. The
PDZ tandem is preceded in syntenin by a 112-residue N-terminal domain of unknown
function and followed by a short C-terminal domain (Grootjans et al., 1997). We
determined the crystal structures of a number of variants of syntenin, alone and in the
presence of peptide ligands, some corresponding to natural ligands and some designed to
test the specificity of the two PDZ domains. Our structural studies have provided
significant insight into the mechanism of ligand recognition by the PDZ domains of
syntenin, and led to a generalized proposal of the “combinatorial” model of peptide
recognition by PDZ domains. This model accounts for many observations that were
incongruous with the canonical model of PDZ peptide recognition.

All of the syntenin constructs used in this study were generated from a clone
of human syntenin obtained from ATCC (ATCC 72537). The gene was subcloned into
the pGST-parallel 1 vector to insert the following fragments behind a GST affinity tag:
full length (residues 1-298), PDZ1 (113-193), PDZ2 (197-273 and 197-270), Δ119
(120-298), PDZ2-C (197-298), and the tandem (residues 113-273). Expression and
purification details are provided in the appended publications.

2.1.  The  structure  of  the  PDZ  tandem  of  syntenin 

The first study we carried out in this area was that of the PDZ tandem (residues
113-273). This original crystal structure was solved using a three-wavelength MAD
experiment at the synchrotron beamline X9B of the NSLS. The model was refined at 1.9 Å
resolution to an Rwork of 17.7% and an Rfree of 23.6% (Kang et al., 2003b). The atomic
model revealed two PDZ tandems in the asymmetric unit arranged in a head-to-tail
fashion and related by a non-crystallographic two-fold symmetry axis. Like other
domains from this superfamily, the syntenin PDZ modules show a typical fold with two
opposing antiparallel β-sheets capped by two α-helices. Each domain has at least one
β-strand that is partly contained in both sheets. There is some variation in the linker region
between the two monomers, in part because of a very slight difference in the angles
between the two PDZ domains in the different monomers. There is a quite extensive
interface between the two monomers in the asymmetric unit.

Although the sequence identity between the two PDZ domains is modest (26%),
the two domains are very similar to each other, with an rms deviation of only 1.2 Å for
the Cα atoms. The main differences between the domains are the length of the β2-β3
loop, which is 4 residues longer in PDZ1, and the width of the peptide binding groove. A
hydrogen bond between an α2 residue (Ser170) and a backbone amide of β2 (Ser131)
tethers the distal end of the peptide binding groove, making the helix-to-strand distance
1.8 Å shorter in PDZ1 than in PDZ2.

The crystal structure suggested that syntenin has a defined supramodular
architecture. This was further confirmed by the stability studies carried out in
collaboration with Dr. Otlewski (University of Wroclaw, Poland). Solvent denaturation
experiments showed that the isolated PDZ1 and PDZ2 have significantly different
stabilities. The free energy of unfolding, ΔGun, for PDZ1 is -3.2 kcal/mole, while for
PDZ2 it is -4.8 kcal/mole. The experimental unfolding of the tandem follows a
cooperative, two-state profile, with a ΔGun of -4.1 kcal/mole. Further, the full-length
protein unfolds in a highly cooperative manner, and shows significantly higher stability
(ΔGun of -6.4 kcal/mole) than any of the other constructs. It is therefore clear that the two
PDZ domains interact within the protein and that the N- and C-terminal extensions also
play a structural role (Kang et al., 2003b).

To further investigate the structure of syntenin in solution, we recently used NMR
spectroscopy (in collaboration with Dr. John Bushweller, UVA). According to 15N{1H}
NOEs and 15N relaxation times, the PDZ tandem is monomeric in solution and the two
domains tumble as a single unit with a rotational correlation time of 10 ns. The precise
arrangement of the PDZ domains in solution was determined from residual dipolar
couplings. While it is similar to the crystal structure, the domains are rotated by
approximately -5°, 3°, and -23° about the x, y, and z axes. These different angles between
the domains in the NMR and crystal structures suggest that the linker region is
somewhat flexible and that the relative orientation of the two domains is not completely
fixed, at least in the absence of the N- and C-terminal domains.
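
The three per-axis rotations quoted above can be combined into a single net rotation angle, which gives a rough sense of how far the solution arrangement departs from the crystal structure. The sketch below is illustrative only: the report does not state the rotation convention, so the order (x first, then y, then z, about fixed axes) is an assumption, and the helper names are hypothetical.

```python
import math

def rot(axis, degrees):
    """3x3 rotation matrix for a right-handed rotation about a fixed axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "z":
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(f"unknown axis: {axis}")

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def net_rotation_angle(rx, ry, rz):
    """Net rotation angle (degrees) of R = Rz * Ry * Rx (assumed order)."""
    r = matmul(rot("z", rz), matmul(rot("y", ry), rot("x", rx)))
    trace = r[0][0] + r[1][1] + r[2][2]
    # For any rotation matrix, trace = 1 + 2*cos(angle); clamp for rounding.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

# Approximate per-axis rotations quoted in the text.
angle = net_rotation_angle(-5.0, 3.0, -23.0)
print(f"Net interdomain rotation: about {angle:.0f} degrees")
```

Because the z component dominates, the net angle is close to the z rotation alone under any ordering of the three rotations.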

Furthermore, we analyzed NMR spectra of various syntenin fragments, including
the full-length protein. Comparison of the 1H-15N HSQC spectra of full-length
syntenin with those of the PDZ tandem reveals an increased number of amide resonances
with non-random-coil chemical shifts. This indicates that there are fragments outside
the PDZ domains that are structured in the full-length protein.

The experimental work has just been completed and a manuscript was submitted
for publication (Cierpicki et al., 2004; appended).

2.1. The structure of the isolated PDZ2 domain.


In addition to the studies of the PDZ tandem of syntenin, we carried out extensive
crystallization trials for the isolated PDZ domains. PDZ1 never crystallized, but we
solved the structure of the isolated PDZ2 domain of syntenin. The original PDZ2
construct contained residues 197-273, and the crystals yielded an ultra-high resolution
structure that was refined to 0.73 Å resolution with an R-factor of 7.5% and an Rfree of
8.7%, making it one of the highest resolution and most precisely refined protein
structures determined to date (Kang et al., 2004). Thus, the impact of this result extended
well beyond the NF2 project.

The crystals of the isolated PDZ2 described above were not suitable for
crystallographic studies of protein-peptide complexes, because the peptide binding groove
of one PDZ domain binds the C-terminus of a crystallographically related molecule: the
C-terminal phenylalanine of one PDZ domain occupies the S0 pocket of another molecule,
and the third residue of the construct (Met200) occupies the S-2 site, making hydrophobic
interactions with Phe213. Thus, soaking in peptides, or even co-crystallization, is not
feasible. To circumvent this problem, the last three residues of this construct were
removed, resulting in a construct encompassing residues 197-270. This new construct
proved suitable for co-crystallization with peptides, as documented below.

2.2.  The  interaction  of  syntenin  with  peptides  derived  from  its  partner 
proteins. 

The apo-tandem crystal structure was not sufficient to predict which PDZ domain is
responsible for binding which of the numerous binding partners, including merlin.
In fact, the literature suggested that, at least in some cases, synergistic action by both PDZ
domains is required for biological activity (Grootjans et al., 2000), although this could not
be immediately rationalized from the structure. The canonical classification scheme for
PDZ domains groups them into three classes based on the sequence of the ligand peptide.
Class I domains have been defined as those that bind ligands with a C-terminal sequence
containing a serine or threonine at P-2 and a hydrophobic residue at P0, so that the
consensus motif is -S/T-X-Φ, where Φ represents a hydrophobic residue. Class II
domains bind ligands with hydrophobic residues at P-2 and P0 (-Φ-X-Φ), while a
negatively charged residue at P-2 defines class III interactions (-D/E-X-Φ). Interestingly,
although syntenin has only two PDZ domains, it has been shown to bind ligands of all
three classes. Recently, several PDZ domains have also been found capable of binding
more than one class of peptide. Thus, the shortcomings of this classification scheme are
becoming readily apparent, and other groups have suggested expanding it to include
novel classes (Bezprozvanny and Maximov, 2001). It is with these questions in mind that
we conducted extensive structural and biophysical studies of the PDZ domains of syntenin
in the presence of a number of peptides to determine the mechanism of peptide
recognition.
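
The canonical three-class scheme can be stated compactly in code: class I has Ser/Thr at P-2, class II has a hydrophobic residue at P-2, class III has Asp/Glu at P-2, and in all three the C-terminal residue (P0) is hydrophobic. The sketch below is a hedged illustration, not a tool used in this work. The set of residues counted as hydrophobic is an assumption, since the report does not define it, and the function name is hypothetical.

```python
# Residues treated as hydrophobic here; this set is an assumption.
HYDROPHOBIC = set("AVLIMFWY")

def classify_pdz_ligand(peptide):
    """Assign a canonical PDZ ligand class from the P0 and P-2 residues.

    P0 is the C-terminal residue and P-2 is two positions before it.
    Returns "I", "II", "III" or "unclassified".
    """
    seq = peptide.strip().upper()
    if len(seq) < 3:
        raise ValueError("need at least three C-terminal residues")
    p0, p_minus2 = seq[-1], seq[-3]
    if p0 not in HYDROPHOBIC:
        return "unclassified"
    if p_minus2 in "ST":
        return "I"      # -S/T-X-hydrophobic
    if p_minus2 in HYDROPHOBIC:
        return "II"     # -hydrophobic-X-hydrophobic
    if p_minus2 in "DE":
        return "III"    # -D/E-X-hydrophobic
    return "unclassified"

# Peptide sequences quoted in this report.
for name, seq in [("IL5Ra", "ETLEDSVF"), ("syndecan", "TNEFYA"), ("ESYF", "ESYF")]:
    print(f"{name:9s} {seq:9s} class {classify_pdz_ligand(seq)}")
```

A rule of this kind reads only two positions of the ligand, which is one reason a scheme built on it struggles to account for a domain that accepts more than one class.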

Binding experiments were conducted to determine which PDZ domain of syntenin
is responsible for binding each of the following three ligands: IL5Rα (class I), syndecan
(class II) and merlin (class III) (Kang et al., 2003b). Isothermal titration calorimetry
(ITC) was used to determine the Kd and stoichiometry of binding to syntenin. The
syntenin constructs used in these first experiments were full-length syntenin and the
PDZ tandem. For merlin and IL5Rα, octapeptides and hexapeptides were evaluated. All
of the peptides bind to full-length syntenin and to the PDZ tandem with dissociation
constants (Kd) in the low μM range. For the merlin and syndecan peptides, a 1:1 binding
stoichiometry was observed. Similarly, a 1:1 stoichiometry was found for the IL5Rα
peptide and full-length syntenin, but a 2:1 ratio was found when this peptide was bound
to the PDZ tandem. This suggests not only that this peptide is capable of binding to both
PDZ domains, but also that the non-PDZ regions of syntenin possess a regulatory
function. Furthermore, for the IL5Rα peptide, the length of the peptide did not have a
significant effect on binding, but the merlin octapeptide binds an order of magnitude
more tightly than the hexapeptide, suggesting that the P-7 and P-8 residues play a role in
determining binding specificity. Although experiments involving isolated PDZ2 were
quite feasible, the poor solubility of the isolated PDZ1 precluded similar experiments and
made it necessary to resort to fluorometric titrations using dansylated peptides, which
require much lower protein concentrations. These experiments were also conducted in
collaboration with Dr. Otlewski. The IL5Rα peptide was found to interact with both PDZ
domains, slightly more strongly with PDZ2. The merlin peptide was found to interact
only with PDZ1, and the syndecan peptide interacted only with PDZ2.

We  then  used  X-ray  crystallography  to  determine  the  structure  of  the  PDZ2 
domain  with  two  peptide  ligands  (Kang  et  al.,  2003a),  and  the  structures  of  the  tandem  in 
complex with six peptides as described below (Grembecka et al., in preparation).

In order to explain the degenerate specificity observed in the PDZ2 domain, we
first solved crystal structures of PDZ2 in complexes with the IL5Rα and syndecan
peptides. This study led to a proposal of combinatorial peptide recognition (Kang et al.,
2003a). Recently, we extended those studies to include the structures of the PDZ tandem
with peptides derived from two other partners of syntenin, i.e. neurexin 4
(TNEYYV) (Grootjans et al., 2000) and ephrin B (TNEYKV) (Lin et al., 1999). We also
designed peptides with non-natural sequences to probe the influence of particular
residues on the peptide recognition mechanism. The peptide series TNEFYF, TNEFAF,
and TNEAYF was used to determine the binding behavior when the peptide contained
three aromatic residues, and the influence of substitutions at the P-1 and P-2 positions.
These results are all new and are currently being prepared for publication. Basic
crystallographic data for all these unpublished structures are presented in Table 1.
Because no other PDZ domain has had its structure determined with such an array of
ligands, these studies will provide a wealth of information on the general mechanism of
PDZ domain-ligand recognition and specificity, in addition to specific insight into the
structure-function relationships of merlin.

Table 1. Crystallographic data for the syntenin PDZ tandem-peptide complexes.
Row labels that are illegible in the source are marked [illegible].

Peptide               TNEFYF   TNEYYV   TNEYKV   TNEAYF   TNEFAF   ESYF
a=b (Å)               72.207   72.134   72.341   71.918   72.192   72.091
[illegible]           47197    15024    30277    28136    36073    36350
[illegible]           989      945      993      1197     1162     1169
[illegible]           100.0    99.6     98.7     92.4     99.6     99.9
Number of waters      381      200      281      325      370      395
Peptide in PDZ1       Yes      No       No       Yes      No       Yes
Peptide in PDZ2       Yes      Yes      Yes      Yes      Yes      Yes

Rows for which only some entries are legible (values in source order; their
column assignment is uncertain):

Space group           P41212, P41212, P41212, P41212
c (Å)                 126.046, 127.342
Resolution (Å)        63.25-1.56 (1.60-1.56); 63.25-1.80 (1.85-1.80); 63.25-1.70 (1.74-1.70)
[illegible]           18.1 (23.5); 22.6 (31.0)
[illegible]           0.017, 0.011, 0.011, 0.012, 0.018
Angles (°)            1.494, 1.475, 1.263, 1.203, 1.596

The structures of syntenin’s PDZ tandem in complexes with peptides show many
interesting and novel features that substantially alter our current understanding of these
modules. With respect to the PDZ2 domains in all complexes, the terminal carboxylate of
the peptide always accepts three hydrogen bonds from the amides of the “carboxylate
binding loop” that precedes β2 (Val209, Gly210, and Phe211), and the last side chain of
the peptide fits into a hydrophobic P0 pocket defined by Val209, Phe211, Phe213, and
Leu258. However, although the classical model of PDZ domain peptide recognition
requires that the P-2 residue participate in specificity determination, the P-2 Ser of IL5Rα
(ETLEDSVF) and of the ESYF peptide does not directly interact with the PDZ domain,
and the peptide backbone is slightly displaced from α2 at the distal end of the binding
pocket. Indeed, although these are class I peptides, PDZ2 of syntenin does not
even have a His at the beginning of α2, as is required by the classical model. In contrast to
the missing interaction at P-2, the side chains at P-1 fit into a well-defined hydrophobic S-1
pocket formed by His208, Ile212, and Val222.

Although an interaction of the P-1 residue with the PDZ domain is not included in
the classical model, it is seen in a number of other structures determined in this study. In
the PDZ2 structure with a syndecan peptide (TNEFYA), the P-1 Tyr is situated in the S-1
pocket with the aromatic ring of the tyrosine stacking against His208. Similar interactions
are observed for other class II peptides, including neurexin 4 (TNEYYV), ephrin B
(TNEYKV), and the hydrophobic peptides TNEFYF and TNEAYF. While the rest of the
interactions for ephrin B and neurexin 4 are typical of class II interactions with PDZ
domains, the TNEFYF and TNEAYF peptides do not interact with the S-2 site.

Three of the peptides (TNEFYF, TNEAYF and ESYF) bound to both PDZ2 and
PDZ1. However, the binding of these peptides to PDZ1 is strikingly different from their
binding to PDZ2, and deviates significantly from the canonical model. Although the
C-terminal carboxylate, P0 and P-1 interactions are similar to the canonical interaction, the
peptide then turns sharply, either crossing over 02 or exposing the peptide to solvent. An
example of this type of non-canonical binding is seen in Figure 3.


Figure 3. The structure of the PDZ tandem of syntenin with an ESYF peptide. The syntenin
structure is shown in a ribbon representation colored from blue (N-terminus) to red. PDZ1 is on the
right and PDZ2 is on the left. The peptide is represented by a coil for the main chain with all atoms
shown for the side chains. Note that although the peptide in PDZ2 follows the peptide binding
groove, the peptide in PDZ1 interacts only through its C-terminal region.


The interaction of the syndecan (TNEFYA), neurexin (TNEYYV), ephrin B
(TNEYKV) and hydrophobic (TNEFYF) peptides with the PDZ tandem of syntenin
was also investigated by NMR spectroscopy, to verify whether the interactions seen in the
crystals represent those in solution. The binding of peptides was analyzed from a series of
1H-15N HSQC spectra recorded for the PDZ tandem titrated with increasing concentrations
of peptides. All of the peptides were found to bind to both PDZ1 and PDZ2, albeit with
significantly different changes in the chemical shifts for PDZ1 and PDZ2. The most
noticeable changes are found within the PDZ2 domain, but there are significant
perturbations of the chemical shifts for the TNEFYF and TNEYYV peptides. The
weakest alterations resulted from the interaction of the TNEFAF peptide, which is
consistent with the fact that this peptide was not found in the PDZ1 binding pocket in the
X-ray structure. Figure 4 shows the crystal structure of the syntenin PDZ tandem with the
residue surfaces color coded to indicate the changes in chemical shift upon peptide
binding.


Figure  4.  Differences  in  NMR  chemical  shifts  caused  by  peptide  binding.  The  surface  of  the  crystal 
structure  of  the  PDZ  tandem  has  been  colored  according  to  the  magnitude  of  the  chemical  shift 
differences  between  the  PDZ  tandem  and  the  PDZ  tandem  with  peptide.  Colors  range  from  blue  (no 
change)  to  red  (largest  change).  The  EFYA  peptide  is  shown  in  each  case  for  reference.  In  each  case, 
PDZ1  is  on  the  right  and  PDZ2  is  on  the  left. 

NMR titration experiments were also used to assess the strength of the interactions of
the PDZ tandem with peptides. One of the advantages of this method is the simultaneous
quantification of the interactions of the peptides with both domains. Binding constants
were obtained by fitting the chemical shift changes as a function of ligand
concentration. For all peptides we observed fast exchange kinetics, in agreement with the
moderate affinities obtained from ITC measurements for PDZ2. In addition, we were able
to measure interactions with the syntenin PDZ1 domain. Interestingly, the peptides derived
from syndecan, neurexin and ephrin B bind to PDZ1 with at least 10 times weaker
affinities than to PDZ2. The interaction of TNEYKV with PDZ1 is difficult to quantify,
as it is weaker than 10 mM. Binding constants obtained using ITC and NMR titration
experiments agree very well for the three hexapeptides. These data are presented in more
detail in Table 2.
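
In the fast-exchange regime the observed chemical shift change is proportional to the fraction of protein bound, so a Kd can be extracted by fitting a 1:1 binding isotherm to the shift changes. The following is a minimal sketch of that idea, not the fitting procedure actually used in this work. It assumes a single 1:1 site per resonance, and the concentrations, shift values and function names are hypothetical. A simple scan over Kd is used so that the example needs only the standard library.

```python
import math

def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (all values in the same unit)."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, ligand_concs, shifts, kd_min=1.0, kd_max=1e5, steps=2000):
    """Least-squares estimate of (Kd, maximal shift change) by scanning Kd.

    For each trial Kd on a logarithmic grid, the best maximal shift change
    has a closed-form linear least-squares solution.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        f = [fraction_bound(p_total, l, kd) for l in ligand_concs]
        delta_max = sum(x * y for x, y in zip(f, shifts)) / sum(x * x for x in f)
        sse = sum((y - delta_max * x) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, delta_max)
    return best[1], best[2]

# Hypothetical, noise-free titration: 200 uM protein, Kd = 100 uM, 0.30 ppm maximum.
p_total = 200.0
ligands = [0.0, 50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
shifts = [0.30 * fraction_bound(p_total, l, 100.0) for l in ligands]

kd_fit, dmax_fit = fit_kd(p_total, ligands, shifts)
print(f"Fitted Kd = {kd_fit:.0f} uM, maximal shift change = {dmax_fit:.2f} ppm")
```

Because each resonance reports on its own domain, the same fit applied to PDZ1 and PDZ2 resonances separately yields one Kd per domain from a single titration series.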

Table 2. Binding affinities of PDZ2 probed by ITC and NMR. Headers and peptide
names that are illegible in the source are marked [illegible].

Peptide        [illegible]    NMR (PDZ tandem)
               Kd (μM)
TNEFYA         115            76 ± 1.6       160 ± 10
TNEFYF         98             1.00 ± 0.07    790 ± 30
TNEFAF         ~600           n.d.           n.d.
[illegible]    ~1000          n.d.           n.d.
[illegible]    96             0.74 ± 0.05    96 ± 9
TNEYKV         >400           >10            115 ± 90
EFYF           105            n.d.           n.d.
YYF            102            n.d.           n.d.

A large number of crystallographic experiments were conducted in an attempt to
characterize the interaction of merlin peptides in the active site of syntenin’s PDZ
domains. Both co-crystallization of syntenin’s PDZ tandem with the octapeptide and
hexapeptide, and soaking of these peptides into preformed crystals, were attempted.
Crystals obtained by both techniques were used to collect diffraction data, and the
structures were determined. Unfortunately, no peptides were observed in the
active site.


Key  Research  Accomplishments 

• X-ray crystallographic determination of the structure of the N-terminal (FERM)
domain of merlin, the product of the NF2 causal gene, at 1.8 Å resolution.

•  X-ray  crystallographic  determination  of  the  structure  of  the  tandem  PDZ  domain 
of  syntenin,  alone  and  complexed  with  an  assortment  of  peptides,  including 
several  corresponding  to  physiological  binding  partners  of  syntenin. 

• Ultra-high resolution X-ray crystallographic determination of the PDZ2 domain
of syntenin at 0.73 Å. This is currently the most precisely determined high
resolution structure in the Protein Data Bank.

•  X-ray  crystallographic  determination  of  the  structure  of  the  PDZ2  domain  of 
syntenin  complexed  with  an  assortment  of  peptides,  including  several 
corresponding  to  physiological  binding  partners  of  syntenin. 

• Calorimetric and biochemical analysis of the interaction of the PDZ domains of
syntenin with physiologically relevant peptides, including the identification of
PDZ1 as the domain that binds merlin, PDZ2 as the domain that binds syndecan,
and both as capable of binding IL5Ra.

•  NMR  determination  of  the  structure  of  the  PDZ  tandem  of  syntenin  alone  and  in 
complex  with  various  peptides. 

•  Design  and  publication  of  the  ‘combinatorial’  model  of  peptide  recognition  by 
PDZ  domains. 

Reportable  Outcomes 

Publications: 

Beom Sik Kang, David Cooper, Yancho Devedjiev, Urszula Derewenda, Zygmunt
Derewenda. The structure of the FERM domain of merlin, the neurofibromatosis
type 2 gene product. Acta Crystallographica D Biological Crystallography. 2002.
58 (Pt 3):381-91.

Beom  Sik  Kang,  David  Cooper,  Filip  Jelen,  Yancho  Devedjiev,  Urszula  Derewenda, 
Zbigniew  Dauter,  Jacek  Otlewski,  Zygmunt  S.  Derewenda.  PDZ-tandem  of 
human  syntenin:  crystal  structure  and  functional  properties.  Structure.  2003. 
11:459-468. 

Beom Sik Kang, David Cooper, Yancho Devedjiev, Urszula Derewenda, Zygmunt S.
Derewenda. Molecular roots of degenerate specificity in syntenin's PDZ2
domain: reassessment of the PDZ recognition paradigm. Structure. 2003.
11:845-853.

Beom Sik Kang, Yancho Devedjiev, Ulla Derewenda, Zygmunt Derewenda. The PDZ2
Domain of Syntenin at Ultra-high Resolution: Bridging the Gap Between
Macromolecular and Small Molecule Crystallography. Journal of Molecular
Biology. 2004. 338:483-493.

Tomasz  Cierpicki,  John  H.  Bushweller  and  Zygmunt  S.  Derewenda.  Probing  the 
supramodular  architecture  of  a  multidomain  protein:  the  solution  structure  of 
syntenin.  Submitted  to  Structure 

Two  papers,  dealing  with  the  specificity  of  peptide  recognition  by  syntenin,  are 
currently  in  preparation. 


Presentations: 

Oral Presentation: David Cooper. Crystal Structure of the FERM Domain of Merlin:
The Neurofibromatosis 2 Tumor Suppressor Protein. 29th Mid-Atlantic
Crystallographic Workshop. Williamsburg, Virginia.

Poster Presentation: David Cooper, Beom Sik Kang, Peter Sheffield, Yancho
Devedjiev, Zygmunt Derewenda. Crystal Structure of the FERM Domain of
Merlin, The Neurofibromatosis 2 Tumor Suppressor Protein. American
Crystallographic Association Annual Meeting. 2001, Los Angeles, California.

Poster Presentation: David R. Cooper, Beom Sik Kang, Yancho Devedjiev, Mary E.
Lewis, Zbigniew Dauter, Zygmunt Derewenda. The Tandem PDZ Domains of
Syntenin. 30th Mid-Atlantic Crystallographic Workshop. 2001, Frederick,
Maryland.

Poster Presentation: David R. Cooper, Beom Sik Kang, Yancho Devedjiev, Ulla
Derewenda, Mary E. Lewis, Zbigniew Dauter, Zygmunt Derewenda. The Crystal
Structure of the PDZ Tandem of Syntenin. American Crystallographic Association
Annual Meeting. 2002, San Antonio, Texas. Awarded Oxford Cryosystems
Poster Prize.

Poster Presentation: Jolanta Grembecka, Tomasz Cierpicki, Beom Sik Kang, Milton
Brown, and Zygmunt Derewenda. Towards Rational Design of Selective Ligands
for Syntenin PDZ Domains. Structure-based drug design conference. 2003,
Boston, MA.

Protein  Data  Bank  Depositions: 

1H4R  -  Crystal  Structure  Of  The  FERM  Domain  Of  Merlin,  The  Neurofibromatosis  2 
Tumor  Suppressor  Protein. 

1N99  -  Crystal  Structure  Of  The  PDZ  Tandem  Of  Human  Syntenin. 

1NTE  -  Crystal  Structure  Analysis  of  The  Second  PDZ  Domain  Of  Syntenin. 

1OBX - Crystal Structure Of The Complex Of PDZ2 Of Syntenin With An Interleukin 5
Receptor Peptide.

1OBY - Crystal Structure Of The Complex Of PDZ2 Of Syntenin With A Syndecan-4
Peptide.

1OBZ - Crystal Structure Of The Complex Of The PDZ Tandem Of Syntenin With An
Interleukin 5 Receptor Peptide.

1R6J - Ultra-high Resolution Crystal Structure Of Syntenin PDZ2.

1V1T  -  Crystal  structure  of  the  PDZ  tandem  of  human  syntenin  with  TNEYKV  peptide. 

The following structures will be deposited in the Protein Data Bank.

• PDZ tandem of syntenin with ESYF peptide
• PDZ tandem of syntenin with TNEAYF peptide

•  PDZ  tandem  of  syntenin  with  TNEFYF  peptide 

•  PDZ  tandem  of  syntenin  with  TNEFYF  peptide 

•  PDZ  tandem  of  syntenin  with  TNEYYV  peptide 


All  personnel  involved  at  different  stages  of  the  project 

Principal Investigator

Zygmunt Derewenda

Research Faculty

• Urszula Derewenda

• Yancho Devedjiev

Research  Associates 

•  Tomasz  Cierpicki 

•  David  Cooper 

•  Jolanta  Grembecka 

•  Beom  Sik  Kang 

Technical  Assistants 

•  Holly  Barton 

•  Neelima  Choudhary 

•  Mary  Lewis 

•  Natalya  Oleknovich 


Conclusions 

Determination  of  the  N-terminal  (FERM)  domain  structure 

The structure of the FERM domain of merlin yields significant insight into the function
of merlin. Although the structures of several other FERM domains have now been solved
(Hamada et al., 2000; Pearson et al., 2000), the merlin structure is critical for
understanding neurofibromatosis type 2, since merlin is the only ERM protein that has a
tumor suppressor function (Gusella et al., 1999). With that in mind, the structure of the
FERM domain was analyzed to see how it relates to NF2. A large number of missense
mutations are thought to alter the three-dimensional arrangement of the subdomains,
suggesting that the overall tertiary structure of the FERM domain is crucial for the proper
function of merlin. The majority of the missense mutations that affect the tertiary fold of
the FERM domain do so by altering the interface between the first and second
subdomains. This observation gains weight because the same interface has several other
unique characteristics. First, it is the only region of the FERM domain of merlin whose
electrostatic potential differs dramatically from that of other published FERM domains.
Secondly, it is flanked by clusters of residues that are conserved among the other FERM
domains, yet different in merlin. These patches of residues create surface epitopes that
are unique to merlin among the FERM proteins and probably affect the way merlin
interacts with effectors. Taken together, these observations implicate the face of the
FERM domain that is shared by the first and second subdomains as critical for the
tumor suppressing function of merlin. A more thorough discussion of the conclusions
drawn from the structure of the FERM domain of merlin is given in the Appendix
(Kang et al., 2002).

The solution of the crystal structure of merlin's FERM domain is the most
important accomplishment of this project. Merlin is the key molecule in NF2
pathogenesis, and the structure of this molecule puts the structural biology relevant to the
disease on a strong footing.

The interaction of the N- and C-termini of merlin is well documented, but the
prevalent view in the literature is that only merlin 1, with a C-terminal tail encoded by
exon 17, is able to interact with the FERM domain. The significance of this is enhanced
by the fact that only merlin 1 shows a tumor suppressor function (Pearson et al., 2000;
Sherman et al., 1997a). Our experiments confirm that the N- and C-termini of
merlin 1 interact, with a Kd of 96 nM, but they also detect a weak interaction
involving the C-terminus of merlin 2. It is possible that we were able to detect this
interaction only because of the sensitivity of isothermal titration calorimetry.

The  interaction  of  merlin  with  RhoGDI 

At the time when our proposal was being prepared, there was a significant level of
excitement in the community associated with reported observations suggesting that
some ERM proteins, including merlin, interact with RhoGDI (Maeda et al., 1999b;
Takahashi et al., 1997). Given our interest in RhoGDI and RhoA-mediated signaling
pathways (Longenecker et al., 1999; Wei et al., 1997), we were keen to pursue this
avenue of research and we had the means to do it. However, all our assays returned
negative data. It is also notable that although there has been one report of
co-crystallization of RhoGDI with the FERM domain of radixin, a homologue of merlin
(Hamada et al., 2001), this structure was never published. This strongly suggests that the
crystals did not contain the complex, contrary to what the original publication stated.
Under the circumstances, we discontinued this research and focused on another partner
of merlin, syntenin.

The structure and function of syntenin


Although many PDZ-containing proteins have multiple PDZ domains, our structure of
the tandem of PDZ domains of syntenin was the first reported crystal structure to contain
more than one PDZ domain within a contiguous polypeptide chain (Kang et al., 2003b).
Each PDZ domain in the tandem appeared capable of binding peptides, as these domains
deviated very little from a classical PDZ fold. One exception to this is the insertion of a
basic residue after the initial glycine of the signature peptide-binding GLGF loop. The
crystal structure, along with stability studies, strongly suggested that syntenin has a
supramodular architecture, and that the mutual disposition of the two PDZ domains is
fixed. This was a novel concept at the time, but other examples were soon documented
(Long et al., 2003). Since crystal structures are occasionally biased by crystal
packing forces, we pursued this question with a study of the PDZ-tandem and intact
syntenin using a relatively novel NMR technique based on the measurement of residual
dipolar couplings (RDCs). We showed (manuscript submitted) that syntenin indeed has a
stable supramodular structure, but that, because of crystal packing, this structure is not
identical to the one observed in the crystal.

The studies of syntenin-peptide complexes revealed several surprises, and resulted
in observations whose significance extends beyond the structural biology of NF2. We
discovered the basis of degenerate specificity in the PDZ2 domain of syntenin and we
revised the current theory regarding the molecular recognition of peptides by the PDZ
domains. This work was featured on the cover of the July 2003 issue of the Cell Press
journal Structure. Moreover, the crystal structure of the isolated PDZ2 domain set
a new record for the highest precision in crystallographic analysis. The ultra-high
resolution structure of PDZ2 proved not only to be informative for the study of the
mechanism of PDZ peptide recognition, but, owing to the unprecedented quality of the
data, it also set a new standard for protein crystallography.

Further, in the last year, we solved several structures of complexes of the
PDZ-tandem with various peptides, probing the nature of the differences between the two
domains. We characterized a new, non-canonical mode of interaction, unique to the
PDZ1 domain. Again, these results significantly extend our understanding of the
structural biology of the PDZ domains.

The structure of the FERM domain of merlin, the
neurofibromatosis type 2 gene product

Received 23 August 2001
Accepted 10 December 2001

PDB Reference: merlin FERM domain, 1h4r, r1h4rsf.

Neurofibromatosis type 2 is an autosomal dominant disorder
characterized by central nervous system tumors. The cause of
the disease has been traced to mutations in the gene coding for
a protein that is alternately called merlin or schwannomin and
is a member of the ERM family (ezrin, radixin and moesin).
The ERM proteins link the cytoskeleton to the cell membrane
either directly through integral membrane proteins or
indirectly through membrane-associated proteins. In this
paper, the expression, purification, crystallization and crystal
structure of the N-terminal domain of merlin are described.
The crystals exhibit the symmetry of space group P2₁2₁2₁, with
two molecules in the asymmetric unit. The recorded diffraction
pattern extends to 1.8 Å resolution. The structure was
solved by the molecular-replacement method and the
model was refined to a conventional R value of 19.3%
(Rfree = 22.7%). The N-terminal domain of merlin closely
resembles those described for the corresponding domains in
moesin and radixin and exhibits a cloverleaf architecture with
three distinct subdomains. The structure allows a better
rationalization of the impact of selected disease-causing
mutations on the integrity of the protein.


1.  Introduction 

Neurofibromatosis type 2 (NF2), first described in 1822 by the
Scottish surgeon Wishart, is an often devastating autosomal
dominant disorder affecting one in every 40 000-90 000
potential births, depending on geographic factors (Evans et al.,
1992, 2000; Gutmann, 2001; Martuza & Eldridge, 1988). Until
about 1985, NF2 was often linked with neurofibromatosis
type 1, also a dominant inherited disorder, and the two were
collectively referred to as von Recklinghausen disease.
Individuals affected by NF2 develop central nervous system tumors
such as Schwann cell tumors of the eighth cranial nerve
(bilateral vestibular schwannomas), meningiomas and
ependymomas, which although classified as cancers are typically
slow-growing and non-malignant. The clinical symptoms vary
profoundly from a mild to a very severe phenotype, with
diagnostic prevalence of the disease significantly lower than
birth incidence (Evans et al., 2000; Gutmann, 2001).

Neurofibromatosis type 2 is associated with a homozygous
inactivation of the NF2 gene. Located within 17 exons in the
long arm of chromosome 22, this gene encodes a 595-residue
protein denoted as schwannomin or merlin (Rouleau et al.,
1993; Trofatter et al., 1993). Alternative splicing of exon 16
results in the presence of another isoform, which differs only
in the C-terminal 11 residues, with important functional
consequences (Sherman et al., 1997). There is convincing
evidence that mutations inactivating some or all of the
biological functions of merlin, which acts as a tumor
suppressor protein, are the causal factor behind the etiology of
NF2. For example, in schwannoma, overexpression of the
wild-type NF2 gene but not of a mutant leads to growth
suppression and impaired cell motility, adhesion and spreading
(Gutmann et al., 1998, 1999; Sherman et al., 1997). Furthermore, mice
with targeted mutations in the NF2 gene develop malignant
tumors (McClatchey et al., 1998).

Merlin is a member of a larger group of proteins, which
includes protein 4.1, talin and three closely homologous
proteins known collectively as ERM, i.e. ezrin, radixin and
moesin (Mangeat et al., 1999; Tsukita et al., 1994, 1997; Tsukita
& Yonemura, 1997). The ERM proteins have no known
catalytic function, but are believed to participate in signaling
phenomena by providing a link between the actin
cytoskeleton and the membrane (Tsukita et al., 1994). Like other
ERM proteins, merlin contains three domains: the N-terminal
domain (also denoted as the FERM domain) comprising
approximately the first 300 residues, a central coiled-coil
fragment and a C-terminal polypeptide containing the last 120
residues. The C-terminal polypeptide of merlin is unique
among the ERM family members in that it does not contain an
actin-binding motif (Mangeat et al., 1999; Turunen et al., 1998).
The molecular physiology of merlin and of the ERM proteins
in general involves intermolecular or intramolecular
head-to-tail interaction between the FERM domain and the
C-terminal polypeptide (Meng et al., 2000; Nguyen et al., 2001;
Sherman et al., 1997; Tsukita et al., 1997). The FERM domain
of merlin has been implicated in intermolecular interactions
with such proteins as CD44 (Herrlich et al., 2000), EBP50
(NHE-RF; Murthy et al., 1998), SCHIP-1 (Goutebroze et al.,
2000), HRS (Scoles et al., 2000), β1-integrin (Obremski et al.,
1998) and RhoGDI (Maeda et al., 1999). Whether or not all of
these interactions are physiologically relevant remains to be
validated, as do the specific signaling pathways relevant to
merlin. However, the regulated association of the FERM
domain of merlin with the C-terminal polypeptide (also
denoted C-ERMAD) mediates tumor-growth suppression in
normal cells (Sherman et al., 1997). Under normal conditions
the association between the two domains is regulated by
phosphorylation of the C-terminal polypeptide, although it is
not clear what induces this process.

Recently, the molecular architecture of the ERM proteins
has become better understood owing to X-ray diffraction
analyses of the FERM domains of radixin and moesin. The
moesin domain structure was solved at 1.9 Å resolution in
complex with its partner C-terminal polypeptide, but with the
intervening coiled-coil fragment removed by recombinant
methods (Pearson et al., 2000), and was also studied
independently in a form which includes an extension into the
coiled-coil region at 2.7 Å resolution (Edwards & Keep, 2001).
The radixin FERM domain was solved with and without
bound inositol-(1,4,5)-triphosphate (IP3) at 2.8 and 2.9 Å
resolution, respectively (Hamada et al., 2000). In addition, a
more distantly related domain from protein 4.1 was also
solved by X-ray diffraction at 2.8 Å resolution (Han et al.,
2000). These studies revealed that the FERM domains are
structurally very similar, with a cloverleaf-like architecture
consisting of three distinct subdomains. The N-terminal
subdomain has a ubiquitin-like fold and is followed by a
subdomain resembling an acyl-CoA binding protein and a
third subdomain reminiscent of a phosphotyrosine-binding
domain (PTB) or pleckstrin homology domain (PH). In the
structure of the moesin intramolecular complex (Pearson et
al., 2000), the C-terminal polypeptide adopts an extended
meandering conformation, which suggests that without its
FERM partner it is unable to form a stable tertiary fold.

In spite of significant progress in the studies of ERM
proteins, the structure of merlin, the specific molecule
associated with NF2, has not been described. It is important to
stress that there are critical functional differences between
merlin and its homologs and that only merlin mutations are
associated with the neurofibromatosis phenotype. Efforts to
design therapeutic agents able to interact with the FERM
domain of merlin in a way that could relieve NF2 symptoms
would certainly benefit from an accurate knowledge of the
molecular structure of merlin itself. Here, we report the
structure of the human merlin FERM domain (residues 1-313)
at 1.8 Å resolution. The structure reveals the expected
conserved cloverleaf architecture of the FERM domain and
provides an additional rationale for the pathological effects of
the known NF2-associated missense mutations. It also suggests
regions of the protein that are critical for the interactions with
effectors and/or activators of merlin.

2.  Materials  and  methods 

2.1.  Construction  of  FERM  domain  expression  plasmids 

A merlin clone was purchased from the American Type
Culture Collection (ATCC 106908). It contained the nucleotide
sequence corresponding to the merlin N-terminal 341
amino acids. To express this sequence using the Gateway
gene-expression system (Life Technologies) and to introduce the
recombinant TEV protease (rTEV) cleavage site between the
glutathione S-transferase (GST) tag and the target protein, we
designed three primers: the attB1-rTEV primer
(5'-GGGGACAAGTTTGTACAAAAAAGCAGGCTCCGAAAACCTGTATTTTCAGGGC-3'),
the rTEV-merlin primer
(5'-TCCGAAAACCTGTATTTTCAGGGCATGGCCGGGGCCATCGCTTCCCGC-3')
and the attB2-merlin primer
(5'-GGGGACCACTTTGTACAAGAAAGCTGGGTTCATCGAGCGAGGCCACGCTGCCGCTCCATCTGCTTTCTATCC-3').

A PCR product generated by two-step PCR (rTEV-merlin
primer and attB2-merlin primer first and then attB1-rTEV primer and
attB2-merlin primer) was cloned into pDEST15, a GST fusion
protein vector, according to the manufacturer's instructions
and this clone was named pDEST15:merlin341. To improve
the efficiency of purification, we modified the vector to include
a hexa-His (His6) tag at the NdeI site in front of the GST sequence
using the primers
5'-TATGTCAGGGCACCATCACCATCACCATTCTGGGGCTGC-3' and
5'-TAGCAGCCCCAGAATGGTGATGGTGATGGTGCCCTGACA-3'. This vector
was denoted pHisDEST15:merlin341. Finally, we introduced
a stop codon after Ala313 to eliminate the amino acids
extraneous to the FERM domain, using the primers
5'-GAGGAGAAGGAAAGCCTAGTCTTTGGAAGTTCAGCAG-3' and
5'-CTGCTGAACTTCCAAAGACTAGCCTTTCCTTCTCCTC-3'. This resulted in the clone
pHisDEST15:merlin313, which was used in all subsequent
protein-expression experiments.
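
As an illustration of how the His6-tag oligonucleotides fit together, the short sketch below checks that the two primers, as transcribed in this text, anneal into a duplex with two-nucleotide 5' overhangs, and that the top strand carries six consecutive histidine codons. The variable names are illustrative, and the reading frame (starting at the ATG of the top strand) is an assumption made for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# His6-tag oligonucleotides as transcribed above (5' to 3').
his_top = "TATGTCAGGGCACCATCACCATCACCATTCTGGGGCTGC"
his_bottom = "TAGCAGCCCCAGAATGGTGATGGTGATGGTGCCCTGACA"

# The oligos pair everywhere except for the first two bases of each strand,
# which remain single-stranded as 5' overhangs.
bottom_rc = reverse_complement(his_bottom)
anneals = his_top[2:] == bottom_rc[:-2]
overhang_top, overhang_bottom = his_top[:2], his_bottom[:2]

# Assumed reading frame: codons counted from the ATG at position 2 of the top strand.
frame = his_top[1:]
codons = [frame[i:i + 3] for i in range(0, len(frame) - 2, 3)]
his_codons = [c for c in codons if c in ("CAC", "CAT")]

print("duplex forms:", anneals)
print("5' overhangs:", overhang_top, overhang_bottom)
print("codons:", " ".join(codons))
print("histidine codons:", len(his_codons))
```

The 5'-TA overhangs are consistent with ligation into an NdeI-cut vector, since NdeI leaves 5'-TA ends.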


2.2.  Protein  purification  and  crystallization 

To overexpress the double-tagged merlin FERM domain,
pHisDEST15:merlin313 was introduced into Escherichia coli
BL21 (DE3) RIL strain (Stratagene). LB medium containing
ampicillin (50 mg ml⁻¹) was inoculated using 5%(v/v) of
overnight seed culture. After cultivation at 310 K for 3 h,
1 mM IPTG was added and cells were cultivated at 295 K for a
further 12 h. Cells were harvested by centrifugation at 5000g
for 20 min, resuspended with 50 mM Tris-HCl pH 7.5 (buffer
A) and disrupted by sonication (Sonifier 450, Branson) for
30 s ml⁻¹. The cell lysate was centrifuged at 26 000g for 45 min
and the soluble supernatant was applied to a
glutathione-Sepharose 4B column (Amersham Pharmacia Biotech). After
washing the column with 50 mM Tris-HCl pH 8.5, 50 mM
NaCl, the recombinant protein was eluted with buffer B
(10 mM glutathione). The eluent was subjected to a HiPrep
26/10 Desalting column (Amersham Pharmacia Biotech)
equilibrated with buffer A to remove NaCl and glutathione.
The recombinant protein was digested using rTEV protease
(Life Technologies) at 283 K in the presence of 0.5 mM EDTA
and 1 mM DTT. After digestion, 300 mM NaCl was added to
the digested recombinant protein solution and it was passed
through a glutathione-Sepharose 4B column again to remove
uncut full-length fusion protein and the His6-GST tag. To
remove rTEV protease and residual tag, 10 mM imidazole was
added to the flowthrough from the glutathione-Sepharose 4B
column and this solution was loaded onto an Ni-NTA column
(Qiagen) equilibrated with buffer A containing 300 mM NaCl
and 10 mM imidazole. The flowthrough of this column was
concentrated using a Centriprep YM30 (Amicon), loaded
onto a Superdex G75 column (Amersham Pharmacia Biotech)
and eluted with buffer A containing 300 mM NaCl. The
fractions containing the merlin FERM domain were collected and
concentrated using a Centriprep YM30 for crystallization
screening. All the purification steps, except the rTEV
digestion, were performed at 277 K. The purified FERM domain
contains an additional glycine at the N-terminus arising from
the rTEV recognition sequence. After the purification, about
30 mg of pure protein was obtained from 2.8 l of culture.

After screening for crystallization conditions using Crystal
Screen and ammonium sulfate Grid Screen (Hampton
Research), crystallization conditions were optimized around
0.1 M sodium cacodylate pH 6.5 containing ammonium sulfate
and dioxane. The sitting-drop vapor-diffusion method was
used for all crystallization trials. Drops were formed of 3 µl of
protein solution and 3 µl of reservoir buffer and were overlaid
with a 1:1 mixture of silicone and mineral oils. Crystallization
trays were stored at 294 K. The best crystals were obtained
using a 5 mg ml⁻¹ protein solution and a buffer containing
56% saturated ammonium sulfate, 2% dioxane and 0.1 M
sodium cacodylate.


Table 1

Data-collection and refinement statistics.

Values in parentheses refer to the highest resolution shell.

Experimental data
  Space group                            P2₁2₁2₁
  Unit-cell parameters (Å)
    a                                    87.02
    b                                    89.33
    c                                    96.77
  Resolution (Å)                         30-1.80 (1.86-1.80)
  Mosaicity (°)                          0.69
  Unique reflections                     68222 (6875)
  Redundancy                             3.6 (3.4)
  Completeness (%)                       95.4 (97.2)
  Rsym†                                  0.065 (0.622)
  Average I/σ(I)                         16.8 (2.68)
  Reflections with I > 3σ (%)            70.8 (30.2)
Refinement details
  Resolution (Å)                         5.0-1.8 (1.847-1.8)
  Reflections (working)                  66303 (4917)
  Reflections (test)                     985 (77)
  Rwork‡ (%)                             19.3 (26.0)
  Rfree‡ (%)                             22.7 (26.8)
  No. of waters                          862
  R.m.s. deviation from ideal geometry
    Bonds (Å)                            0.011
    Angles (°)                           1.39
  Average B factor (Å²)
    Main chain                           23.5
    Side chain                           26.1
    Waters                               40.6
    Sulfate                              44.3

† Rsym = Σhkl Σi |Ii(hkl) − ⟨I(hkl)⟩| / Σhkl Σi Ii(hkl).
‡ Rwork or Rfree = Σhkl ||Fobs(hkl)| − |Fcalc(hkl)|| / Σhkl |Fobs(hkl)|.
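
The crystallographic residual quoted in Table 1 as Rwork and Rfree can be illustrated with a short sketch. The same expression is evaluated over the working reflections to give Rwork and over the excluded test reflections to give Rfree. The amplitudes below are invented toy values, and the function name is illustrative; only the reflection counts are taken from Table 1.

```python
def r_factor(f_obs, f_calc):
    """R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|) over the supplied reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Toy structure-factor amplitudes (hypothetical values, arbitrary units).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
r_toy = r_factor(f_obs, f_calc)

# Fraction of reflections set aside for Rfree, from the counts in Table 1.
n_work, n_test = 66303, 985
test_fraction = n_test / (n_work + n_test)

print(f"toy R = {r_toy:.3f}")
print(f"test set = {100.0 * test_fraction:.2f}% of reflections")
```

Because the test reflections are never used in refinement, a gap between Rfree and Rwork that stays small, as it does here (22.7% against 19.3%), is commonly read as a sign that the model has not been overfitted.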


2.3.  Data  collection,  structure  solution  and  refinement 

The crystal used for data collection was briefly soaked in a
solution containing 12.5%(v/v) glycerol and 56% ammonium
sulfate before being transferred to 24% glycerol and 30%
ammonium sulfate and frozen by immersion in liquid nitrogen.
The data were collected at beamline X9B at NSLS at a
wavelength of 0.920 Å under cryoconditions using an ADSC
Quantum 4 CCD. The data were indexed and scaled using
HKL2000 (Otwinowski & Minor, 1997).

The structure was solved by molecular replacement using
AMoRe (Navaza, 1994). The program SEAMAN (Kleywegt,
1996a) was used to create a search model based on the radixin
structure (PDB code 1gc6), with serines substituted for all
non-conserved residues larger than alanine. Manual model
rebuilding was performed in O (Jones et al., 1991). A
combination of CNS (Brünger et al., 1998) and REFMAC from the
CCP4 suite (Collaborative Computational Project, Number 4,
1994) was used for refinement, with the final refinement
performed using REFMAC5 with default values for target
stereochemistry (Murshudov et al., 1997). Waters were added
using ARP/wARP (Perrakis et al., 1999).

3. Results and discussion

3.1.  Crystallization,  data  collection  and  structure  solution 

The FERM domain crystals belong to space group P2₁2₁2₁,
with unit-cell parameters a = 87.02, b = 89.33, c = 96.76 Å.
After two weeks, the average size of the crystals was
0.2 x 0.2 x 0.1 mm (Fig. 1a). The volume of the asymmetric
unit (188 040 Å³) suggested the presence of two molecules,
with a resultant Matthews coefficient of 2.5 Å³ Da⁻¹ and a
solvent content of 51.5%. Data with overall completeness of
95.4% were collected from a single frozen crystal and were
merged and scaled with an Rmerge of 0.065 (Table 1).
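
The crystal-density reasoning above can be reproduced with a short calculation. This sketch assumes an orthorhombic cell with four asymmetric units (as in P2₁2₁2₁), a molecular mass of about 37 kDa for the FERM domain construct (an assumed round figure, not a value stated in the text), and the commonly used approximation that the solvent fraction is 1 − 1.23/VM.

```python
# Unit-cell edges in angstroms, from the text (orthorhombic, so V = a * b * c).
a, b, c = 87.02, 89.33, 96.76
ASU_PER_CELL = 4           # asymmetric units per cell in P2(1)2(1)2(1)
ASSUMED_MW_DA = 37000.0    # assumed mass of one FERM domain molecule (hypothetical)

def matthews_coefficient(asu_volume, n_molecules, mw_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return asu_volume / (n_molecules * mw_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm

asu_volume = a * b * c / ASU_PER_CELL
for n in (1, 2, 3):
    vm = matthews_coefficient(asu_volume, n, ASSUMED_MW_DA)
    print(f"{n} molecule(s): VM = {vm:.2f} A^3/Da, solvent = {100 * solvent_fraction(vm):.1f}%")

vm_two = matthews_coefficient(asu_volume, 2, ASSUMED_MW_DA)
solvent_two = solvent_fraction(vm_two)
```

Under these assumptions only the two-molecule case gives a coefficient near the middle of the range usually seen for protein crystals, which is the basis for expecting two molecules in the asymmetric unit.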

In  order  to  assess  which  of  the  two  existing  atomic  models 
of  FERM  domains  is  closer  to  merlin,  parallel  molecular- 
replacement  calculations  were  performed  with  a  model  based 
on  the  radixin  structure  at  2.8  A  without  IP3  (PDB  code  lgc7; 
Hamada  et  al ,  2000)  and  a  model  based  on  the  moesin 
structure  (PDB  code  lefl;  Pearson  et  al,  2000).  Although  the 
levels  of  sequence  identity  of  merlin  with  radixin  or  moesin 
are  high  (64  and  63%,  respectively),  a  model  was  constructed 
from  each,  with  non-conserved  residues  larger  than  alanine 
truncated  to  serines.  These  two  polyserine  models  yielded 
molecular-replacement solutions that were marginally better
than those obtained from the complete structures (data not
shown). With each model, the molecular-replacement calculations
gave two solutions, in agreement with expectations
based  on  crystal  density  considerations.  By  most  statistical 
criteria,  the  moesin-based  model  provided  the  best  solution 
for  the  rotation  and  translation  function,  but  the  radixin-based 
model  provided  a  better  solution  after  rigid-body  refinement. 
Based  on  this  and  the  slightly  higher  sequence  similarity  with 
radixin,  the  radixin-based  model  was  used  as  the  starting  point 
for  refinement.  During  the  refinement,  non-crystallographic 
symmetry  restraints  were  not  applied,  given  the  relatively  high 
resolution  of  the  data.  A  combination  of  CNS  and  REFMAC 
was  used  to  refine  the  structure,  with  the  final  rounds  of 
refinement  performed  in  REFMAC5.  Maximum-likelihood 
refinement  of  the  model  converged  with  the  statistics  reported 
in  Table  1.  To  determine  the  extent  of  model  bias,  several 
rounds  of  refinement  were  also  performed  using  the  moesin- 
based  model.  This  refinement  was  discontinued  when  we  were 
satisfied  that  the  model  was  not  significantly  biased  by  the 
initial  model  choice. 

3.2.  Quality  of  the  refined  atomic  model 


The final model consists of two molecules of the FERM domain, 861
water molecules and six sulfate ions. The first 19 amino acids in each
of the FERM domain molecules are not visible in the electron density.
The refined structure conforms to standard protein stereochemistry,
with an r.m.s. deviation from ideal bond lengths of 0.011 Å and only
five of the 588 residues of the structure falling into generously
allowed regions of the Ramachandran plot (Laskowski et al., 1993).
Only a few side chains are not entirely contained within the electron
density of a 2Fo − Fc σA-weighted map at 1σ (Figs. 1c and 1d). Each
monomer contains one cis-proline. A limited number of residues exhibit
static disorder, but alternate conformations were not refined at this
point.

Figure 1

(a) Typical crystals of the FERM domain of merlin. (b) The FERM domain
of merlin is shown in a ribbon representation color-ramped from blue to
red. The subdomains are labeled as they are described in the text and
the main secondary-structural elements are labeled. (c) Typical electron
density is shown contoured at 1.2σ in a 2mFobs − DFcalc map. (d) The
electron density for the region with the highest B factors is shown
contoured at 1.0σ in a 2mFobs − DFcalc map. Figs. 1(b), 2, 3, 4 and 5
were produced using MOLSCRIPT (Kraulis, 1991) and Raster3D. Figs. 1(c)
and 1(d) were produced with BOBSCRIPT (Esnouf, 1997) and Raster3D
(Merritt & Bacon, 1997).

The main-chain temperature factors range from 12.0 to 54.3 Å², with
average values of 22.6 and 24.4 Å² for chains A and B, respectively.
This similarity is easily rationalized by similar packing of both
molecules in the crystal lattice. The temperature factors are generally
higher, as expected, in external loops. The exception to this is strand
β5C, a terminal strand in a β-sheet, which is stabilized by hydrogen
bonds on one side only. The low B values reported in this study reflect
the superior quality of the atomic model. This is particularly striking
when these values are compared with those found in the radixin models
(~64 Å² for the main chain) or the isolated FERM domain of moesin (PDB
code 1e5w; ~70 Å² for main chain). While this discrepancy may stem from
limited resolution in those studies, it is also likely that the
refinement protocols may not have been optimal. For example, some of
the loops in the moesin FERM domain (PDB code 1e5w) have B values in
excess of 140 Å², which corresponds to an unrealistic value of the
mean-square displacement ⟨r²⟩ of nearly 2.0 Å².
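
The conversion behind this comparison is the standard isotropic relation B = 8π²⟨u²⟩. The following minimal sketch applies it to the B values quoted in this section; the function name is ours.

```python
import math


def b_to_msd(b_factor):
    """Convert an isotropic B factor (Å^2) to a mean-square displacement (Å^2).

    Uses the standard relation B = 8 * pi^2 * <u^2>.
    """
    return b_factor / (8.0 * math.pi ** 2)


# B values quoted in the text: merlin chain averages, radixin, moesin, worst moesin loops.
for b in (22.6, 24.4, 64.0, 70.0, 140.0):
    msd = b_to_msd(b)
    print(f"B = {b:6.1f} A^2 -> <u^2> = {msd:.2f} A^2, r.m.s. displacement = {math.sqrt(msd):.2f} A")
```

With this relation a B value of 140 Å² corresponds to a mean-square displacement of about 1.8 Å², in line with the figure of nearly 2.0 Å² given above.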

The r.m.s. distance between the Cα atoms of the two molecules,
following least-squares overlap, is 0.77 Å. Only a handful of residues,
mostly solvent exposed, have different side-chain conformations in the
two monomers. The segments with larger differences correlate with areas
of higher temperature factors and areas that are involved in
crystallographic contacts. The region with the highest discrepancy is
the N-terminus of α3B, with the preceding loop and the 3₁₀-helix. The
average r.m.s. coordinate error derived from the program SIGMAA in the
CCP4 suite is 0.11 Å.
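
For readers unfamiliar with the statistic, the sketch below shows how an r.m.s. distance over paired Cα atoms is computed once two models have been superposed, with an optional distance cutoff of the kind used when outlying pairs are excluded from a comparison. It does not perform the least-squares fit itself and is not the algorithm of any particular program; the coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b, cutoff=None):
    """R.m.s. distance between paired, already superposed coordinates.

    Pairs further apart than `cutoff` (in Å) are skipped when a cutoff is given.
    Returns (rmsd, number_of_pairs_used).
    """
    squared = []
    for p, q in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(p, q))
        if cutoff is None or d2 <= cutoff ** 2:
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs within the cutoff")
    return math.sqrt(sum(squared) / len(squared)), len(squared)


# Made-up coordinates: three pairs 0.5 Å apart and one outlier 4.0 Å apart.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_b = [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0), (2.0, 0.5, 0.0), (3.0, 4.0, 0.0)]

print(rmsd(model_a, model_b))              # all four pairs
print(rmsd(model_a, model_b, cutoff=3.5))  # outlier excluded
```

A single outlying pair dominates the statistic, which is why r.m.s. values are normally reported together with the number of atoms included.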

3.3.  The  overall  tertiary  architecture  and 
comparisons  with  moesin  and  radixin 

The tertiary structure of the FERM domain of merlin is very close to
that of the homologous domains of moesin and radixin (Edwards & Keep,
2001; Hamada et al., 2000; Pearson et al., 2000). The polypeptide chain
folds into three clearly identifiable subdomains, each with
similarities to known single-domain proteins. These three structural
elements were denoted differently for the moesin and radixin structures
and we here choose to follow the latter convention, according to which
the merlin fragment encompassing residues 20 to approximately 100 is
defined as A, that including residues 101-215 is denoted B, and the
third fragment, residues 216-313, is denoted C (Fig. 1b). As noted by
others (Hamada et al., 2000; Pearson et al., 2000), the A subdomain has
a fold reminiscent of ubiquitin, B is similar to the acyl-CoA binding
protein and C exhibits a fold found in such signaling domains as PTB,
PH and EVH1. Merlin is unique in that it has an additional N-terminal
extension of 19 amino acids compared with both radixin and moesin. It
has been suggested recently based on limited proteolysis experiments
that this fragment is disordered (Brault et al., 2001) and our
structure fully confirms this prediction. The first amino acid clearly
defined in the electron density is Lys20. It is natural to speculate
that this portion of merlin is disordered in solution. However, since
this region has been shown to be necessary for the proper functioning
of merlin and is implicated in actin binding (Brault et al., 2001), it
is also possible that it becomes ordered as merlin binds to some
effector target.

Least-squares  fitting  of  the  merlin  FERM  domain  onto 
radixin  and  moesin  reveals  that  the  mutual  disposition  of  the 
three  subdomains  is  relatively  well  preserved  in  all  three 
proteins,  although  concerted  shifts  of  entire  subdomains  are 
noticeable  albeit  small.  Such  rearrangements  affect  global 
comparisons of r.m.s. positional differences, as the latter are
likely to reflect both local changes and global rearrangements.
Superpositions of the individual subdomains of the FERM domains of
merlin, moesin and radixin are shown in Fig. 2.

Figure 2

Stereoviews of the superpositions of the individual FERM subdomains of
merlin, radixin and moesin: subdomain A (top), subdomain B (center) and
subdomain C (bottom). In all figures, merlin is blue, radixin is red,
1ef1 moesin is green and 1e5w moesin is gold.

To gain a better understanding of the similarities and differences
between the FERM domains, the program LSQMAN (Kleywegt, 1996b) was used
to superpose the FERM domains and their individual subdomains. Owing to
the high degree of similarity of the two monomers of merlin and for
simplicity, values for the A monomer are presented here. When the
entire ordered portion of the FERM domain of merlin (294 residues) is
fitted onto radixin (PDB code 1gc7) or either of the deposited FERM
domains of moesin (PDB codes 1ef1, which corresponds to the 1.9 Å
study, and 1e5w, the 2.7 Å resolution model), the corresponding values
of the r.m.s. positional differences between the Cα atoms range from
1.8 to 2.0 Å. However, when the subdomains are fitted onto the targets
separately and when a few outliers with distances above 3.5 Å are
excluded, the values fall dramatically to approximately 0.92 Å for
moesin and 0.8 Å for radixin. Subdomain A, when fitted onto 1gc7, 1ef1
and 1e5w, showed r.m.s. differences of 0.65, 0.55 and 0.65 Å,
respectively, with 70, 64 and 69 Cα atoms within a 1.5 Å distance. The
discrepancies occurred consistently around residue 31, the loop
comprising amino acids 67-72 and residue 88. For subdomain B, the
results were 0.45, 0.65 and 0.51 Å, with 102, 91 and 103 atoms
included, respectively. There is a consistent discrepancy, which
includes the loop around residue 177. Finally, subdomain C exhibits
r.m.s. differences of 0.61, 0.67 and 0.70 Å, respectively, with 86, 87
and 77 atoms included and a consistent departure of residues 288-291
from the average. None of the differences are of a magnitude which
would suggest a significant biological effect and some can be easily
rationalized in terms of crystal contacts.

The high resolution of the present study permits a detailed analysis of
the interfaces between the three subdomains, including contributions
from the ordered solvent. Two large interfaces contribute to the
integrity of the tertiary structure of the FERM domain. The first
involves residues from subdomains A and C. The C-terminal long helix of
the C subdomain (residues 289-313) packs against two loops of subdomain
A containing residues 69-76 and 99-103. The face of the helix involved
in this interface is largely non-polar and
contains Leu306, Leu299 and Leu295; the A
subdomain contributes Phe100, Trp74 and
Val72.  Numerous  water  molecules  flank  this 
interface;  however,  they  do  not  seem  to  be 
an  integral  part  of  the  interface  but  rather 
form  a  typical  hydration  shell.  This  specific 
interface  is  different  in  both  radixin  and 
moesin  because  of  the  single  amino-acid 
deletion  which  is  found  in  merlin  in  the  loop 
comprising  residues  66-72  and  confers  a 
conformational  change.  As  a  result,  the 
loop  packs  significantly  closer  to  the 
N-terminus  of  the  helix  in  the  C  subdomain, 
probably  because  of  a  salt  bridge  formed 
between  Asp70  on  one  side  and  Arg291  and 
Lys289  on  the  other.  Both  moesin  and 
radixin  lack  an  aspartate  in  this  position  and 
instead  contain  bulky  aromatics  (Phe  or 
Tyr)  which  push  the  loop  away  from  the  C 
subdomain.  The  significant  difference  in  the 
local  structure  of  this  loop,  as  well  as  the 
dramatically  different  amino-acid  sequence 
in  this  region,  suggest  that  this  epitope  may 
be  involved  in  protein-protein  interactions 
unique  to  merlin. 

Another  interface  is  found  between 
subdomains  A  and  B  which,  in  addition  to 
the  intervening  loop,  interact  via  the  first 
helix  of  the  B  subdomain,  which  is  wedged 
between the two subdomains and contributes several hydrophobic side
chains such as Ile126, Val122, Phe118 and Phe119. There are also direct
hydrogen bonds between the subdomains. The interface is closely packed
and lacks any internal water molecules.

There is only a small interface between subdomains C and B. Leu250,
located in the loop between β3C and β4C, fits into a small pocket
formed by Tyr132, Glu215 and Met216. Overall, however, the relative
positions of these two subdomains and the entire FERM ‘cloverleaf’ are
defined by the contacts described above and the covalent linkages.

Figure 3

Crystal packing of merlin. (a) Stereoview of the packing in the unit
cell showing the similarities of the packing of the A and C subdomains
for the two merlin monomers. (b) Stereoview of the dimer interface. A
salt bridge positions the side chains of Glu136 and Arg187 such that
they pack against Trp191. Glu194 hydrogen bonds with amide N atoms at
the N-terminus of a central helix of subdomain B. In both figures
monomer A is green and monomer B is blue.

The crystals of the FERM domain of merlin contain two molecules in the
asymmetric unit related by a non-crystallographic twofold axis running
nearly parallel to the crystallographic b axis and between the B
subdomains of adjacent molecules (Fig. 3a). This packing is consistent
with a strong maximum in the native Patterson seen at 0.0, 0.5, 0.094
(data not shown), indicating translational non-crystallographic
symmetry. The interface between these two molecules, although small
(422 Å²), is quite intricate and involves helices α2B and α4B in a
symmetric arrangement (Fig. 3b). The two helices in each molecule
interact via a salt bridge involving Glu136 and Arg187. Furthermore,
the side chains of Arg187 from the two molecules pack tightly against
each other and are flanked on each side by the indole rings of the two
Trp191 residues. These in turn pack against Glu136 in the neighboring
molecule, further stabilizing this contact. At each end of this
interface, Glu194 caps otherwise non-bonded backbone amides of residues
136 and 137 at the N-terminus of an α-helix, so that each Oε atom
accepts a hydrogen bond from one amide N atom. This elegant cap
stabilizes the incipient helix immediately downstream of a diprolyl
peptide. Finally, two symmetrical pairs of water molecules, each
coordinated by at least three hydrogen-bonding partners, add to the
stability of this contact. We note that in moesin this general area is
involved in the binding of the C-terminal polypeptide and that many of
the residues participating in this interface are conserved among the
FERM domains, all of which suggests a functional significance.

Figure 4

Missense mutations of merlin. The distribution of NF2-associated
missense mutations is shown by the presence of a sphere at the Cα.
Red spheres represent a substitution mutation, purple a deletion and
green an insertion. Sites of mutations are labeled.

Other  crystal  contacts  also  contribute  to  the  stability  of  the 
lattice.  Residues  30-36  in  subdomain  A  of  molecule  1  interact 
with  two  loops  of  subdomain  C  in  an  adjacent  molecule  1,  i.e. 
residues  280-282  and  252-255.  As  both  subdomains  are 
roughly  at  the  same  x  and  y  coordinates,  this  arrangement 
forms  a  chain  which  runs  along  the  c  axis  of  the  crystal. 
Interestingly,  molecule  2  shows  similar  contacts  and  thus  the 
symmetry  of  the  two  molecules  is  broken  by  the  different 
packing  of  B  subdomains  against  C  subdomains  of  molecules 
in  the  next  layer  along  the  c  axis.  The  carboxyl  terminus  of  the 
C  subdomain  of  molecule  1  is  buried  in  the  loop  which 
includes  residues  169-180  of  subdomain  B  of  molecule  2, 
whereas  the  carboxyl  terminus  of  molecule  2  is  wedged 
between  the  two  B  subdomains.  The  structural  differences 
observed  between  B  subdomains  in  moesin,  radixin  and 
merlin  may  reflect,  at  least  in  part,  the  impact  of  this  crystal 
contact. 


3.4.  The  NF2-associated  missense  mutations 

The  structure  of  moesin  has  been  used  previously  to  analyze 
the  structural  consequences  of  NF2-associated  mutations  in 
merlin.  However,  given  that  the  current  study  focuses  on  the 
NF2  causal  gene  product  itself,  it  is  proper  to  address  this  issue 
again.  Although  the  most  devastating  mutations  of  merlin  are 
nonsense  mutations  that  cause  a  premature  termination  of 
merlin,  a  number  of  missense  mutations  are  associated  with 
milder cases of the disease (Gutmann et al., 1998). As can be
seen in Fig. 4 and Table 2, 20 of these mutations are distributed
throughout the FERM domain, with a slightly higher
frequency of mutations in the A subdomain. Most of the
NF2-associated mutations occur at sites that are conserved
between  merlin  and  other  FERM  domains,  with  six  of  these 
(Leu46,  Phe62,  Leu64,  Lys79,  Phe96  and  Ile273)  completely 
conserved  among  the  ERM  proteins  and  protein  4.1.  While  a 
number  of  the  NF2-associated  mutations  are  likely  to  cause 
critical  disruption  in  the  packing  of  the  respective  subdomain, 
the  majority  of  the  mutations  may  impact  the  subdomain 
interfaces.  This  suggests  that  the  specific  architecture  of  the 
cloverleaf  is  crucial  for  the  normal  function  of  the  protein. 
None  of  the  mutations  occur  at  the  surface  interacting  with  the 
C-terminal  polypeptide  of  merlin,  as  predicted  from  the 
structure  of  the  moesin  complex. 
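
The distribution of the Table 2 mutations over the three subdomains can be tallied directly from the residue numbers. The sketch below uses the subdomain boundaries adopted in this paper (with the approximate A/B boundary taken as residue 100) and the mutation labels as we read them from Table 2; the helper names are ours, and the per-residue density is only a rough illustration of the statement about subdomain A.

```python
import re
from collections import Counter

# Subdomain boundaries as adopted in the text (A: 20 to ~100, B: 101-215, C: 216-313).
SUBDOMAINS = (("A", 20, 100), ("B", 101, 215), ("C", 216, 313))


def subdomain_of(residue_number):
    """Return 'A', 'B' or 'C', or None for residues outside the ordered FERM domain."""
    for name, first, last in SUBDOMAINS:
        if first <= residue_number <= last:
            return name
    return None


def residue_number(mutation):
    """Extract the residue number from labels such as 'E38V', 'ins 49L' or 'del F96'."""
    match = re.search(r"\d+", mutation)
    if match is None:
        raise ValueError(f"no residue number in {mutation!r}")
    return int(match.group())


# Mutation labels as read from Table 2 (del F118 and del F119 counted separately).
TABLE2 = ["E38V", "W41C", "L46R", "ins 49L", "F62S", "L64P", "M77V", "K79E", "del F96",
          "E106G", "L117I", "del F118", "del F119", "L141P", "del Q178", "G197C",
          "V219M", "N220Y", "L234R", "E270G"]

counts = Counter(subdomain_of(residue_number(m)) for m in TABLE2)
for name, first, last in SUBDOMAINS:
    length = last - first + 1
    print(f"subdomain {name}: {counts[name]} mutations over {length} residues "
          f"({counts[name] / length:.3f} per residue)")
```

With these boundaries the three subdomains add up to the 294 ordered residues, and subdomain A carries the most mutations both in absolute number and per residue.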

The subdomain interface that is affected by the largest number of
NF2-associated mutations is the AB interface. Phe62 is directly
involved in the AB interface and the mutation of this residue to a
serine removes part of the hydrophobic interaction between these two
subdomains. The L117I mutation in subdomain B is also found at the
hydrophobic interface between these two subdomains. The insertion of a
leucine in subdomain A after residue 49 may also affect the AB
interface by altering the conformation of the subsequent loop. This
loop facilitates four inter-subdomain hydrogen bonds, with the extended
side chains of Glu58 in subdomain A and Gln111 in subdomain B hydrogen
bonding with the backbone of the adjacent subdomain. Furthermore,
deletions of Phe118 and Phe119, together or individually, were also
found in NF2 patients. These residues are contained in α1B and further
underscore the functional importance of the AB interface.

Table 2

NF2-associated mutations.

HC, hydrophobic core. SubA, subdomain A; subB, subdomain B; subC, subdomain C; ins, insertion; del, deletion.

Mutation                  Structural consequence                         Phenotype           Reference

Subdomain A mutations
E38V                      Decreased solubility or impaired interactions  Mild NF2            Parry et al. (1996)
W41C                      Side-chain packing in subA                     Mild NF2            Welling et al. (1996)
L46R                      HC of subA                                     Meningiomas         Merel et al. (1995)
ins 49L                   HC of subA + AB hydrogen-bond loss             Mild NF2            Ruttledge et al. (1996)
F62S                      AB hydrophobic interface                       Mild or severe NF2  Scoles et al. (1996)
L64P                      HC of subA                                     Not reported        Xu & Gutmann (1998)
M77V                      See text                                       Intermediate NF2    Evans et al. (2000)
K79E                      See text                                       Schwannomas         Sainz et al. (1994)
del F96                   HC of subA                                     Severe NF2          MacCollin et al. (1994)

Subdomain B mutations
E106G                     BC packing (salt-bridge loss in subB)          Severe NF2          Bourn et al. (1994)
L117I                     AB hydrophobic interface                       Meningiomas         De Vitis et al. (1996)
del F118 and/or del F119  AB hydrophobic interface + HC of subB          Severe NF2          Bourn et al. (1995)
L141P                     Breaks helix and disrupts fold of subB         Not reported        Unpublished†
del Q178                  See text                                       Severe NF2          Kluwe et al. (2000)
G197C                     Unfavorable conformation in loop               Mild NF2            Welling et al. (1996)

Subdomain C mutations
V219M                     BC hydrophobic interface                       Meningiomas         Merel et al. (1995)
N220Y                     AC hydrophobic interface                       Mild NF2            Ruttledge et al. (1996)
L234R                     HC of subC                                     Severe NF2          Jacoby et al. (1999)
E270G                     Loss of salt bridge disrupts subB              Severe NF2          Kluwe et al. (1998)

† Unpublished mutation found at http://neuro-triaIsl.mgh.harvard.edu/nf2.

Several NF2-associated mutations affect the interfaces
between  domain  C  and  the  other  two  subdomains.  Two 
mutations,  V219M  and  N220Y,  are  located  at  the  N-terminal 
strand  of  subdomain  C.  Their  side  chains  point  in  opposite 
directions,  with  the  side  chain  of  219  pointing  towards  the  BC 
interface  and  the  side  chain  of  220  in  the  direction  of  the  AC 
interface.  Introduction  of  a  bulkier  side  chain  at  either  site 
may  disrupt  the  respective  interface.  The  mutation  E106G 
removes  a  side  chain  involved  in  a  salt  bridge  with  Lys209, 
possibly  allowing  subdomain  B  to  rotate  closer  to  subdomain 
C.  In  other  ERM  proteins  a  serine  or  alanine  is  found  in  this 
position,  making  this  interaction  unique  to  merlin. 

The majority of the other NF2-associated mutations are most likely to
disrupt local packing within their respective subdomains. L46R and
L234R introduce a large charged residue in the hydrophobic core of
subdomains A and C, respectively. The L64P substitution and the
deletion of Phe96 would create cavities in the hydrophobic core of
subdomain A, as well as disrupt the local secondary structure
surrounding these residues. The G197C substitution occurs at the loop
between α2B and α3B, which requires a backbone conformation that is
unfavorable for cysteine (φ = 100, ψ = −13°) and may decrease the
solubility of the protein by exposing the side chain to solvent. L141P
may destabilize subdomain B by inserting a proline into the middle of
one of the central helices (α1B). Although this residue is not on the
side of the helix that points towards α4B, it is approximately at the
position where these two central helices cross each other. The E270G
mutation is likely to destabilize the C subdomain and alter one of the
potential effector-binding sites (see below).

Several of the mutations do not clearly fall into the categories of
disrupting subdomain interfaces or subdomain tertiary structure. One
such mutation is K79E, which is at the end of α4A. This
charge-reversing mutation is very likely to cause the formation of a
salt bridge with the neighboring Lys76. Both of these lysines are
conserved among ERM proteins. In the merlin structure Lys76 is hydrogen
bonded to Tyr66 in α3A, which is also conserved in the ERM family.
However, in the radixin structure the homologous lysine extends outward
and interacts with the IP3. Although as yet there is no direct evidence
that merlin binds inositol phosphates, almost all of the residues
responsible for the binding of IP3 in the radixin structure are either
conserved in merlin or replaced with functionally equivalent amino
acids. The charge reversal caused by the K79E mutation would most
likely prevent any inositol phosphates from binding to this pocket.

The  potential  effect  of  mutations  of  residues  with  solvent- 
exposed  side  chains  is  less  clear.  The  side  chain  of  Met77, 
mutated  to  a  valine  in  at  least  one  NF2  case,  packs  against 
Phe47 and the mutation may create a destabilizing solvent-accessible
depression. Similarly, the substitution of Glu38 by a
valine is in a solvent-exposed region. Although this substitution
is sterically accommodated, it would place a hydrophobic
residue on the surface of the protein, possibly substantially
decreasing the solubility of merlin. It is noteworthy that the
same type of substitution in hemoglobin causes sickle-cell
anemia. The nearby mutation of W41C would affect the local
packing of side chains in the area surrounding Glu38. The
deletion of Gln178 is discussed below.


3.5.  Other  functional  implications 

The  apparent  differences  in  the  biological  properties  of  the 
various  members  of  the  ERM  family  call  for  a  careful  analysis 
of  their  respective  molecular  models.  It  has  been  suggested 
recently  that  the  FERM  domain  in  complex  with  the 
C-terminal  polypeptide  is  in  a  ‘dormant’  state  and  that  its 
biological inertness is a product of the occlusion of the relevant
epitopes and conformational differences (Edwards &
Keep, 2001). This suggestion is based on the 2.7 Å analysis of
the structure of the uncomplexed FERM domain of moesin
and on its comparison with the structure of the complexed
moesin at 1.9 Å resolution (Pearson et al., 2000). In particular,
the loop encompassing residues 260-264 (276-280 in merlin)
was found to differ significantly between the two models. We
note, however, that the cloverleaf-like fold has some intrinsic
flexibility made possible by the interfaces between subdomains.
Crystal contacts are sufficient to force minor
distortions,  but  individual  domains  remain  nearly  identical 
within  experimental  error  in  their  tertiary  fold.  Although  the 
276-280  loop  in  our  structure  resembles  the  conformation 
described  by  Edwards  &  Keep  (2001),  we  believe  that  this 
does  not  necessarily  constitute  proof  that  the  difference  is 
caused  by  the  binding  of  the  C-terminal  fragment. 

The  distribution  of  residues  conserved  in  moesin,  radixin 
and  ezrin  but  not  in  merlin  can  shed  light  onto  the  origin  of  the 
functional  differences  between  merlin  and  the  ERM  proteins. 
There  are  72  such  residues  and  an  additional  19  are  found  in 
two  of  the  three  ERM  proteins  but  not  in  merlin.  Almost  all 
are  located  at  the  surface  of  the  protein,  although  they  are  not 
evenly distributed over the surface (Fig. 5). Of these 91 residues
unique to merlin, 31 result in a change in the surface
electrostatics. Relatively few affect epitopes involved in the
binding of the C-terminal polypeptide. The majority of the 91
residues are clustered in three patches, two of which are
roughly at a tip of the cloverleaf. One patch is located in each
subdomain and therefore the patches will be described as
patches A, B and C. These patches are likely to interact with
effectors  or  activators  of  merlin. 
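
The bookkeeping behind the counts of merlin-specific residues can be sketched as a per-column classification of a four-way sequence alignment. The rule below is our interpretation of the two categories described above, not a procedure stated in the text, and the example columns are made up.

```python
def classify_column(merlin, ezrin, radixin, moesin):
    """Classify one alignment column.

    'erm3': the three ERM proteins share a residue that differs from merlin
    'erm2': exactly two ERM proteins share a residue that differs from merlin
    None:   anything else
    """
    erm = (ezrin, radixin, moesin)
    if ezrin == radixin == moesin:
        return "erm3" if merlin != ezrin else None
    for residue in set(erm):
        if erm.count(residue) == 2 and residue != merlin:
            return "erm2"
    return None


# Hypothetical columns given as (merlin, ezrin, radixin, moesin).
columns = [("E", "K", "K", "K"), ("E", "K", "K", "R"),
           ("K", "K", "K", "K"), ("K", "K", "K", "R")]
print([classify_column(*column) for column in columns])

# Counts quoted in the text: 72 + 19 = 91 positions, 31 of which change the charge.
print(72 + 19, f"{100 * 31 / 91:.0f}% alter the surface electrostatics")
```

Roughly a third of the merlin-specific positions therefore involve a change in charge, which is the basis for examining the electrostatic surface.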

Patch C is at the C-terminal end of subdomain C and includes residues
β5C-β7C and the beginning of α1C. All of the merlin-specific residues
in this area have their side chains exposed to the solvent and four of
them, located on the face of the second β-sheet in this subdomain,
involve a charge change from the ERM consensus sequence. Glu270 and
Lys284, mentioned above in the context of the E270G mutation, both
constitute a charge change from the other ERM members and are located
in patch C.

Figure 5

The locations of the residues unique to merlin are shown in green on a
blue space-filled model. The arrows point to the patches described in
the text. In this figure, the C-terminal polypeptide of moesin has been
roughly positioned on merlin to indicate where the FERM and C-terminal
polypeptide interaction is most likely to occur in merlin. The IP3 of
radixin is also included. The image on the left is in the same
orientation as Fig. 1(b).

The second patch of residues unique to merlin is found near the tip of
subdomain A and includes residues found in the distal ends of β1A and
β2A, in the following loop and in the N-terminal end of α1A. Ezrin has
been shown to contain an actin-binding site in this area (Martin et
al., 1997). Although the overall net charge of the region is unchanged
from the ERM consensus sequence, the local electrostatic footprint is
altered by the addition of two acidic and two basic residues, making it
unlikely that this serves as an actin-binding site in merlin. Moreover,
merlin has been shown to contain an actin-binding site within the first
27 residues, 19 of which are not found in the ERM proteins (Brault et
al., 2001). The fact that the E38V and W41C mutations are included in
patch A makes it more likely that the effects of these mutations are
manifested by impairing the ability of merlin to bind to effectors or
activators.

The subdomain B patch contains the beginning of α4B and the loop that
precedes it. This region has been called the ‘Blue Box’ in the
Drosophila homolog of merlin and has been shown to be vital for the
protein’s function (LaJeunesse et al., 1998). A comparison of merlin
with moesin complexed with its C-terminal fragment reveals that the
Blue Box is adjacent to the loop between the A and B helices of the
C-terminal polypeptide. The residues of that fragment that contact the
Blue Box are not conserved between merlin and other ERM proteins; thus,
the molecular surface that covers the most extended part of the
C-terminal polypeptide and the flanking region of merlin is different
from the corresponding regions of ezrin, radixin and moesin. This could
explain why the activation of merlin is not coincident with any of
these proteins. This is also supported by the NF2 phenotype associated
with the deletion of Gln178 located in the Blue Box region. Gln178 is
conserved among the ERM members but not in merlin and is adjacent to
where the most extended part of the C-terminal polypeptide would be as
judged by the moesin complex structure. The nature of this loop would
lead one to believe that this loop could rearrange itself to
accommodate this deletion without too much difficulty; however, this
mutation leads to a severe NF2 phenotype (Kluwe et al., 2000).

Figure 6

Electrostatic potentials generated in GRASP (Nicholls et al., 1991) are
shown for merlin (left), radixin (middle) and moesin (1ef1) (right).
The top views are in the same orientation as in Fig. 1(b) and each
successive image down the figure has been rotated 90° forward.

Although  the  patches  of  residues  unique  to  merlin  create 
local  epitopes,  the  overall  molecular  surface  and  electrostatic 
potential  of  merlin  is  similar  to  that  of  other  members  of  the 
ERM  family.  The  largest  exception  to  this  is  the  AB  interface 
(Fig.  6).  This  cleft  is  much  more  electronegative  than  in  the 
other  ERM  proteins.  It  is  interesting  to  note  that  this  is  the 
surface  that  is  affected  by  many  of  the  NF2-associated 
missense  mutations  and  is  roughly  flanked  by  patches  A  and  B. 
This  leads  one  to  speculate  that  this  region  is  crucial  for  the 
interaction  of  merlin  with  effectors  or  activators. 

4.  Conclusions 

We  have  described  the  structure  of  the  FERM  domain  of 
merlin at 1.8 Å resolution, the highest resolution to date for
any  of  the  FERM  proteins.  As  expected,  the  structure  is 
similar  to  those  of  the  respective  domains  in  radixin  and 
moesin,  but  also  exhibits  interesting  differences  which  may 
have  functional  implications.  This  work  sets  the  stage  for  more 
detailed  analysis  of  structure-function  relationships  in  merlin, 
with the aim of designing ways of either subduing or eliminating
the devastating symptoms of neurofibromatosis type 2.
Summary

Syntenin, a 33 kDa protein, interacts with several cell
membrane receptors and with merlin, the product of
the causal gene for neurofibromatosis type II. We report
a crystal structure of the functional fragment of
human syntenin containing two canonical PDZ domains,
as well as binding studies for full-length syntenin,
the PDZ tandem, and isolated PDZ domains. We
show that the functional properties of syntenin are a
result of independent interactions with target peptides,
and that each domain is able to bind peptides
belonging to two different classes: PDZ1 binds peptides
from classes I and III, while PDZ2 interacts with
classes I and II. The independent binding of merlin by
PDZ1 and syndecan-4 by PDZ2 provides direct evidence
for the coupling of syndecan-mediated signaling
to actin regulation by merlin.

Introduction 

Syntenin is a 298 residue long cytosolic protein, originally
identified as a molecule linking syndecan-mediated
signaling to the cytoskeleton [1]. Subsequently, syntenin
was also found to play a role in protein trafficking [2,
3], cell adhesion [4], and activation of the transcription
factor Sox4 [5]. Of particular medical interest is the recent
report that syntenin is overexpressed in breast and
gastric cancer cells and promotes their migration and
metastasis [6]. The diverse biological functions of this
protein are a result of its interactions with numerous
targets. There are currently at least ten putative binding
partners reported for syntenin, including IL-5 receptor
α subunit (IL5Rα) [5], neuroglian [7], proTGF-α [3], glutamate
receptors [8], neurofascin [7], syndecan-4 [1],
ephrin B [9, 10], ephrin A7 [9], PTP-η [11], neurexin I
[12], and merlin [13]. All the binding partners of syntenin
are receptors except for merlin, a cytosolic tumor suppressor
that is a product of the causal gene for type II
neurofibromatosis (NF) [14]. Merlin belongs to the protein
4.1 superfamily, which also includes ezrin, moesin,
and radixin, and like its homologs, it binds actin [15].

Based  on  amino  acid  sequence  analyses,  syntenin 
was  predicted  to  contain  a  tandem  of  PDZ  domains 
(PDZ1  and  PDZ2)  preceded  by  an  N-terminal  fragment 
of 112 amino acids of an unknown structure. PDZ domains
are ubiquitous signaling domains, with over 400
distinct  copies  in  the  human  genome  [16,  17],  which 
mediate  protein-protein  interactions.  They  may  occur  in 
proteins  harboring  other  domains,  such  as  SH2,  RGSL, 
PH,  DH,  or  GUK,  but  are  also  found  in  proteins  that 
contain  no  other  domains:  an  extreme  example,  MUPP, 
is  a  scaffolding  protein  with  13  PDZ  domains  [18]. 
Through  the  PDZ  domains,  signaling  proteins  bind  to 
receptors,  channels,  and  other  targets,  often  functioning 
as  membrane-associated  scaffolds  for  the  assembly  of 
signaling  complexes.  Finally,  PDZ-containing  proteins 
provide a means for subcellular targeting of their partners,
as exemplified by the function of Lin-2/CASK [19],
Lin-10/MINT1 [20], and GRIP [21, 22].

PDZ domains are structurally conserved modules,
about 90 amino acids in length, with a distinct fold of
six β strands and two α helices [23, 24]. In most cases,
they recognize C-terminal sequence motifs of target
proteins and bind these peptides in a pocket between
the β2 strand and α2 helix. The PDZ domains are typically
grouped into three classes depending on the target
tripeptides: class I (-S/T-X-Φ), class II (-Φ-X-Φ), and
class III (-D/E-X-Φ) [17], where Φ denotes a hydrophobic
residue and X any residue. Examples outside this paradigm
are well documented, and some PDZ domains show
degenerate specificity [25]. It has also been reported
that interaction between adjacent PDZ domains may
modulate peptide binding, further complicating the picture.
For example, the PDZ1-PDZ2 tandem within PSD-95
appears to have different binding properties compared
to its isolated PDZ domains [16].
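
The three-class scheme above can be written out as a short rule set. The following is a minimal sketch only: the set of residues counted as hydrophobic (Φ) is an assumption made for illustration, and the function name is hypothetical. It is applied to the three hexapeptides used in the binding assays of this study.

```python
# Residues treated as hydrophobic (Φ). This set is an assumption for
# illustration; published definitions of Φ vary.
HYDROPHOBIC = set("AVLIMFWYC")


def classify_pdz_target(peptide: str) -> str:
    """Classify a peptide by its C-terminal tripeptide (P-2, P-1, P0)."""
    if len(peptide) < 3:
        raise ValueError("need at least three C-terminal residues")
    p_minus2, _p_minus1, p0 = peptide[-3:].upper()
    if p0 not in HYDROPHOBIC:
        return "unclassified"      # all three classes end in Φ
    if p_minus2 in "ST":
        return "class I"           # -S/T-X-Φ
    if p_minus2 in HYDROPHOBIC:
        return "class II"          # -Φ-X-Φ
    if p_minus2 in "DE":
        return "class III"         # -D/E-X-Φ
    return "unclassified"


for name, seq in [("IL5Rα", "LEDSVF"), ("syndecan-4", "TNEFYA"), ("merlin", "AFFEEL")]:
    print(name, seq, classify_pdz_target(seq))
```

A rule set this simple cannot capture degenerate specificity; it only labels the target peptide, not the domain that binds it.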

The multitude of syntenin’s putative partners, which
belong to all three classes of target proteins, suggests
that its PDZ domains may also exhibit degenerate specificities.
Furthermore, it has been reported that the two
domains function in a cooperative fashion: for example,
isolated PDZ1 and PDZ2 apparently fail to bind merlin-
and IL5Rα-derived peptides, whereas binding was reported
for full-length protein [5, 13]. A requirement for
the complete tandem was also reported for interaction
with PTP-η [11] and proTGF-α [3], whereas syndecan-2
was reported to bind to PDZ1-PDZ2 or PDZ2-PDZ2
tandem, but neither the isolated domains nor
PDZ1-PDZ1 [12].

In order to explain the molecular basis for the observed
properties of syntenin, we have initiated a systematic
study aimed at characterizing syntenin’s molecular
structure and mechanism of action. Here, we report
the crystal structure of the intact PDZ tandem, residues
113-273, refined at 1.94 Å resolution. The structure reveals
two PDZ domains that have fully solvent-accessible
peptide binding pockets. We also report the results
of rigorous biophysical binding assays for isolated PDZ
domains, the PDZ tandem, and full-length syntenin, with
peptides derived from three selected putative binding
partners for syntenin: IL5Rα, syndecan-4, and merlin.
These data reveal that the binding properties of syntenin
are a result of the independent binding events of the
two PDZ domains, whose specificities show clear degeneracy.
The merlin-derived octapeptide shows the
highest affinity for syntenin, and a distinct selectivity for
PDZ1. This result reaffirms that merlin is a physiologically
important partner for syntenin.

Results  and  Discussion 

Model  Quality  and  Structure  Overview 
The structure was solved with a three-wavelength MAD
experiment, using a SeMet-labeled protein crystal. The
model, refined at 1.94 Å resolution to a crystallographic
R value of 18.4% (Rfree 22.7%), contains a noncrystallographic
dimer of tandems in the asymmetric unit and a
total of 325 residues (Table 1; Figure 1). The only syntenin
residue not included in the model is the C-terminal
phenylalanine of the second monomer. The refined
structure conforms well to standard protein stereochemistry,
with rms deviation from ideal bond lengths of
0.015 Å, and with only 2 residues falling into disallowed
regions of the Ramachandran plot as judged by MolProbity
[26]. Only seven side chains are not visible in
the σA-weighted 2mFobs - DFcalc electron density map
contoured at 1σ (Figure 1C), and their occupancies were
set to zero. The average isotropic temperature factor
(B) for main chain atoms is 20.6 Å², with the highest
temperature factors (~50 Å²) associated with the linker
peptides and the C-terminal end of the α2 helix of PDZ1,
all of which are nonetheless clearly visible.
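
The R value quoted above follows the definition given in the footnotes to Table 1, and the redundancy row of that table is simply total reflections divided by unique reflections. The sketch below illustrates both calculations. The reflection counts are taken from Table 1; the structure factor amplitudes in the R factor example are made-up numbers used only to show the arithmetic, and the function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def redundancy(total_reflections, unique_reflections):
    """Average number of measurements per unique reflection."""
    return total_reflections / unique_reflections


# (total, unique) reflection counts for each wavelength, from Table 1.
counts = {
    "edge": (84171, 22557),
    "peak": (85123, 22608),
    "remote": (86109, 23095),
}
redundancies = {k: round(redundancy(t, u), 1) for k, (t, u) in counts.items()}
print(redundancies)

# Toy amplitudes (hypothetical, not from the data set).
print(r_factor([10.0, 20.0], [9.0, 22.0]))
```

Rfree uses the same expression, evaluated only over the test set of reflections that was excluded from refinement.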

The  crystallized  fragment  of  syntenin  contains  two 
PDZ  domains  conjoined  by  a  short  linker  (Arg193- 
Pro194-Phe195-Glu196;  Figure  1C).  Like  other  domains 
from  this  superfamily,  the  syntenin  PDZ  modules  show 
a typical fold with two opposing antiparallel β sheets
capped by two α helices. Each domain has at least one
β strand that is partly contained in both sheets. In the
crystal, the two PDZ tandems of syntenin are arranged in
a head-to-tail fashion, related by a noncrystallographic
dyad,  giving  the  contents  of  the  asymmetric  unit  the 
appearance  of  a  four-leaf  clover.  Interestingly,  the  linker 
residue  Arg193  forms  a  salt  bridge  with  Glu240  in  PDZ2 
and  forms  a  hydrogen  bond  via  its  N8  with  a  main  chain 
carboxyl  group  of  Phe154  in  PDZ1.  This  may  help  to 
explain  why  Arg193  falls  into  a  disallowed  region  of 
the  Ramachandran  plot.  Superposition  of  the  monomers 
reveals that there is a slight difference in the angle between
the two PDZ domains in the two monomers, explaining
why the dimer is noncrystallographic. This suggests
that the linker has considerable intrinsic flexibility
in  solution. 


We  note  that  the  interaction  between  the  two  PDZ 
domains  within  a  monomer  is  less  extensive  than  the 
intermolecular PDZ1-PDZ2 interface. A total of 446 Å²
of solvent-accessible surface is buried by each intermolecular
interaction. Furthermore, this interface is fairly
intimate,  with  a  number  of  hydrogen  bonds  between  the 
two  domains.  The  few  solvent  molecules  that  are  at  this 
interface  mediate  contacts  between  the  two  domains. 
In  contrast,  there  are  no  direct  hydrogen  bonds  between 
the PDZ domains within a monomer. Both putative peptide
binding grooves of syntenin are located on the same
face of the tandem monomer and are completely exposed
to the solvent, suggesting that syntenin has two
distinct  and  functional  peptide  binding  sites. 


Structure  of  the  PDZ  Domains  of  Syntenin 
A structural comparison of the two PDZ domains of
syntenin reveals that, in spite of a modest level of sequence
identity (26%), they are structurally very similar
to each other, with an rms deviation of 1.2 Å on Cα
atoms. In both domains, the fragment equivalent to the
signature GLGF loop involved in the terminal carboxylate
binding deviates from the paradigm by an insertion
of a basic residue after the initial glycine (Arg in PDZ1
and His in PDZ2). Such insertions are rarely found in
PDZ domains, but they do not seem to disturb the cluster
of main chain amides that coordinate the incoming carboxylate
of the target peptide. Typically, a Lys or Arg
located 4 or 5 residues prior to this loop assists in peptide
binding via a water-mediated hydrogen bond. Both
of syntenin’s PDZ domains have a lysine 4 residues
before the initial glycine.

In  spite  of  these  similarities,  there  are  some  notable 
differences between the two domains. The most apparent
is the length of the β2-β3 loop. When compared to
PDZ2, where this loop is shorter than in most other PDZ
domains, PDZ1 contains an insertion of 4 residues in
the β2-β3 loop (KSIDNGIF versus KN----GK).

Furthermore, in PDZ1, the peptide binding groove is
narrower as compared to PDZ2 or other PDZ domains
(Figure 2A). This is best illustrated by comparing the
distance from the beginning of α2 to the β2 strand of
PDZ2, to the corresponding distance in PDZ1, which is
1.8 Å shorter.

The  electrostatic  potential  surrounding  the  peptide 
binding  groove  is  another  significant  difference  between 
the  two  PDZ  domains.  The  peptide  binding  surface  of 
PDZ1 is predominately positively charged, surrounded
by 3 residues (Lys124, Arg128, and Lys130) from β2 and
2 residues (His175 and Lys179) from α2. Other basic
residues flank this region. PDZ2 lacks any clusters of
positively charged residues, with His208 as the only
charged side chain that extends over the peptide binding
groove (Figure 2B).

As  a  peptide  binds  to  PDZ  domains,  it  mimics  an 
additional antiparallel strand in the sheet containing β2.
The position of the β2 strand in both of syntenin’s PDZ
domains is consistent with this mechanism, with the
amino and carboxyl groups of Leu129 and Phe211 available
for hydrogen bonding. This type of arrangement
dictates  that  the  terminal  side  chain  of  the  peptide  faces 
the  interior  of  the  binding  pocket.  Both  PDZ  domains 
Table 1. Data Collection and Refinement Statistics

                                   Edge                     Peak                     Remote
Data Collection Statistics
  Wavelength                       0.97946                  0.97900                  0.97133
  Resolution (Å)                   30.0-1.94 (2.01-1.94)ᵃ   30.0-1.94 (2.01-1.94)ᵃ   30.0-1.94 (2.01-1.94)ᵃ
  Total reflections                84,171                   85,123                   86,109
  Unique reflections               22,557                   22,608                   23,095
  Redundancy                       3.7                      3.8                      3.7
  Completeness (%)                 97.4 (75.1)              97.1 (71.9)              98.8 (90.2)
  Rsym (%)ᵇ                        4.9 (25.9)               6.2 (31.2)               4.8 (31.1)
  Average I/σ(I)                   20.2                     23.1                     19.7
Phasing Statistics
  Phasing power,ᶜ iso/ano          0.37/0.27                0.23/0.35                -/0.31
  Rcullisᵈ iso/ano                 0.43/0.69                0.71/0.57                -/0.66
  Overall figure of merit (acentric): 0.68
Refinement Statistics
  Resolution (Å)                   30.0-1.94 (1.99-1.94)
  Reflections (working)            21,926 (1,560)
  Reflections (test)               1,182 (79)
  Rwork (%)ᵉ                       18.4 (24.0)
  Rfree (%)ᵉ                       22.7 (25.2)
  Number of waters                 254
  Rms deviation from ideal geometry
    Bonds (Å)                      0.015
    Angles (°)                     1.80
  Average B factor (Å²)
    Main chain                     20.6
    Side chain                     26.3
    Waters                         47.0

ᵃ The numbers in parentheses describe the relevant value for the highest resolution shell.
ᵇ $R_{sym} = \sum |I_i - \langle I \rangle| / \sum I$, where $I_i$ is the intensity of the i-th observation and $\langle I \rangle$ is the mean intensity of the reflections. The values are for unmerged Friedel pairs.
ᶜ Phasing power = $\langle |F_h(\mathrm{calc})| / \text{phase-integrated lack of closure} \rangle$
ᵈ $R_{cullis} = \langle \text{phase-integrated lack of closure} \rangle / \langle |F_{ph} - F_p| \rangle$
ᵉ $R = \sum ||F_{obs}| - |F_{calc}|| / \sum |F_{obs}|$, crystallographic R factor, and $R_{free} = \sum ||F_{obs}| - |F_{calc}|| / \sum |F_{obs}|$, where all reflections belong to a test set of randomly selected data.

have a substantial hydrophobic pocket near the binding
loop that could accommodate any of the large hydrophobic
side chains. The walls and floor of the peptide
binding groove in both PDZ domains are lined with hydrophobic
residues. Neither PDZ domain has a histidine
at the first position of α2, as is found in typical class
I PDZ domains. In syntenin, this position is occupied in
PDZ1 and PDZ2 by Ser171 and Asp251, respectively.
The side chains of both of these residues hydrogen bond
to main chain amides at the end of β2. Overall, both of
the PDZ domains of syntenin appear to be suitable for
peptide binding, although the structural differences suggest
diverse specificities.

Stability  Studies 

Although  the  crystal  structure  indicates  that  both  PDZ 
domains  of  syntenin  are  capable  of  binding  peptides, 
many  previously  reported  binding  studies  using  isolated 
PDZ1  and  PDZ2  domains  have  failed.  In  order  to  better 
assess  the  feasibility  of  performing  binding  assays  with 
isolated  PDZ1  or  PDZ2,  we  conducted  stability  studies 
using chemical denaturation monitored by circular dichroism
(Figure 3). Full-length syntenin, the PDZ tandem,
and isolated PDZ1 (residues 113-193) and PDZ2
(residues 197-273) were used in the assay. Surprisingly,


we found that the isolated PDZ1 and PDZ2 have significantly
different stabilities: the free energy of unfolding,
ΔGun, for PDZ1 is -3.2 kcal/mol, whereas for PDZ2 it
is -4.8 kcal/mol, putting it closer to the average of 5-15
kcal/mol observed for globular proteins. Based on these
values, the expected denaturation of the tandem, as
simulated by combining single domain transitions,
should be less cooperative, with a ΔGun of -2.07 kcal/mol
(Figure 3, insert). However, the experimental unfolding
of the tandem follows a cooperative, two-state
profile, with a ΔGun of -4.1 kcal/mol, suggesting that the
domains are associated into a single entity undergoing
cooperative denaturation. As already noted, PDZ1 of
one molecule interacts with PDZ2 of the adjacent one in
the crystal structure. The large buried interface suggests
that the interaction could be physiologically relevant, as
indeed self-association has been reported for syntenin
[7]. It is also possible that the two domains interact in
this way within a monomer, and that the crystal structure
corresponds to a domain-swapped conformation. Whereas
our data are consistent with weak, albeit identifiable
domain-domain interactions in syntenin, further work
will be required to probe this issue.
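
To make the comparison between a "combined single domain" curve and a cooperative tandem transition concrete, the sketch below uses the standard two-state model with linear extrapolation, in which the unfolding free energy falls linearly with denaturant concentration. This is an assumption: the fitting equation actually used in the study is not reproduced in this section. The m values are hypothetical, the stabilities are entered as positive magnitudes (the text quotes them with a negative sign), and the two domains are weighted equally, which ignores any difference in their CD amplitudes.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(denaturant_m, dg_water, m_value, temp_k=298.15):
    """Two-state model: dG([D]) = dG_water - m*[D]; K = exp(-dG/RT)."""
    dg = dg_water - m_value * denaturant_m
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def combined_fraction_unfolded(denaturant_m, domains, temp_k=298.15):
    """Equal-weight average of independent single-domain transitions."""
    parts = [fraction_unfolded(denaturant_m, dg, m, temp_k) for dg, m in domains]
    return sum(parts) / len(parts)


# (dG_water in kcal/mol, m in kcal/(mol*M)); the m values are hypothetical.
pdz1 = (3.2, 2.0)
pdz2 = (4.8, 2.0)

for conc in (0.0, 1.0, 1.6, 2.0, 2.4, 3.0, 4.0):
    print(conc, round(combined_fraction_unfolded(conc, [pdz1, pdz2]), 3))
```

Because the two midpoints differ, the averaged curve is broader than either single-domain transition, which is why fitting it with a single two-state model would return a lower apparent stability.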

Finally, we note that the full-length protein also unfolds
in a highly cooperative manner, and shows significantly
higher stability (ΔGun of -6.4 kcal/mol) than any
of the other constructs. This result may imply that the
N-terminal fragment of syntenin is folded and plays a
structural role in the protein, possibly interacting with
the PDZ domain(s), so that full-length syntenin denatures
as a single entity.

Figure 1. The Structure of the PDZ Tandem
of Syntenin

(A) A stereo Cα trace with every tenth α carbon
represented as a sphere and every twentieth
α carbon labeled, and colored from blue
to red as a function of residue number.

(B) Ribbon diagram of the asymmetric unit
colored by B factors. B factors are represented
with low values (12 Å²) colored blue
and high values (43 Å²) colored red.

(C) Experimentally determined electron density
map of the linker region contoured at 1σ.
Residues 189-201 are shown for each monomer.
Figures were made using MOLSCRIPT
[47] (A and B) and POVSCRIPT+
(http://people.brandeis.edu/~fenn/povscript/) (C)
and rendered with RASTER3D.


Binding Properties of Syntenin and of Isolated
PDZ Domains

In  the  case  of  full-length  syntenin  and  the  PDZ  tandem, 
we were interested in both the affinities and the stoichiometries
of peptide binding. With that in mind, we carried
out  binding  assays  using  isothermal  titration  calorimetry 
Figure 2. A Comparison of PDZ1 and PDZ2 Domains of Syntenin

(A) Superposition of the two PDZ domains of syntenin. PDZ1 is gold and PDZ2 is blue. The α2 helices have been superposed to show the
similarity of the fold, yet emphasize the differences of the peptide binding groove. The same orientation is used for all three figures.

(B) The peptide binding surface of PDZ1. The electrostatic potential surface is shown with select residues that surround the peptide binding
groove labeled. A superposed C-terminal CRIPT-derived peptide from the structure of PSD-95 (1BE9 [23]) is shown semitransparent, with
side chains represented as cyan spheres in the β carbon position. The approximate locations of the P0, P-1, and P-2 binding pockets are
indicated by gold, pink, and green circles, respectively.

(C) The peptide binding surface of PDZ2 represented as described in (B). Figures were made using MOLSCRIPT [47] (A) and POVSCRIPT+
(http://people.brandeis.edu/~fenn/povscript/) (B and C) and rendered with RASTER3D [48]. Electrostatic potentials were calculated in GRASP [49].


(ITC). The assays were conducted with the following
hexapeptides derived from three putative targets of syntenin,
each belonging to one of the three canonical
classes: LEDSVF (IL5Rα) representing class I, TNEFYA
(syndecan-4) from class II, and AFFEEL (merlin) from
class III. We found that all peptides bind to full-length
syntenin and to the PDZ tandem with dissociation constants
(Kd) in the low μM range (Figure 4; Table 2). Interestingly,
the IL5Rα-derived peptide shows a stoichiometry of 2:1
for the tandem, but only 1:1 for full-length syntenin,
whereas all other measurements indicate equimolar
complexes. This is further evidence that suggests a
functional role for the N-terminal domain.

To assess whether residues upstream of the C-terminal
hexapeptide contribute to binding, we performed
the assays with octapeptides for merlin and IL5Rα. The
results are very similar for IL5Rα, but the merlin octapeptide
binds an order of magnitude more tightly than the
corresponding hexapeptide, indicating the functional
importance of residues -6 and -7.
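
An order-of-magnitude difference in Kd corresponds to a modest difference in binding free energy, which can be estimated from the relation ΔG = RT ln Kd (with Kd expressed relative to a 1 M standard state). The sketch below applies it to the full-length syntenin values for the merlin hexapeptide (8.9 μM) and octapeptide (869 nM) listed in Table 2. The temperature of 298.15 K is an assumption, since the assay temperature is not stated in this section.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy, dG = RT ln(Kd), in kcal/mol."""
    return R_KCAL * temp_k * math.log(kd_molar)


dg_hexa = binding_free_energy(8.9e-6)   # AFFEEL, full-length syntenin
dg_octa = binding_free_energy(869e-9)   # RVAFFEEL, full-length syntenin
ddg = dg_octa - dg_hexa                 # negative: octapeptide binds more tightly

print(round(dg_hexa, 2), round(dg_octa, 2), round(ddg, 2))
```

Under these assumptions the two additional residues contribute roughly 1.4 kcal/mol of binding energy.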

The determination of Kd values for isolated domains
by ITC proved difficult, because PDZ1 aggregated at the
required high concentration and isolated PDZ2 (residues
197-273) was prone to oligomerization (data not shown).
To overcome the aggregation problems, we used a fluorescence-based
approach using dansylated hexapeptides,
allowing for significantly lower protein concentration
[27]. We also changed the PDZ2 construct to
residues 197-298, which includes syntenin’s natural C
terminus. To assess whether either the technique or


Figure 3. Stability of Syntenin Constructs
GdmCl-induced unfolding of PDZ1 (•), PDZ2
(○), tandem of PDZ domains (A), and full-length
syntenin (V). Measurements were
performed in 25 mM Tris, 50 mM NaCl (pH
7.4). Transitions were monitored by the
changes of the CD signal at 222 nm. Data
were normalized as “fraction unfolded” and
fitted to the equation in the text. Insert: combined
single domain transitions (□) and tandem
of PDZ domains transition (■).
Figure 4. Representative Calorimetric Titration of PDZ Tandem of
Syntenin with LEDSVF Peptide

Top: raw heat data corrected for base drift, obtained from 14 consecutive
injections of 11.2 mM LEDSVF peptide into a sample cell
(1,250 μl) containing 140 μM PDZ tandem of syntenin.

Bottom:  the  binding  isotherm  created  by  plotting  the  areas  under 
the  peaks  against  the  molar  ratio  of  the  peptide  added  to  the  PDZ 
tandem  present  in  the  cell  and  the  fit  line  to  the  model  of  independent 
sites.  The  heats  of  mixing  (dilution)  have  been  subtracted. 


peptide  dansylation  influenced  the  observed  affinities, 
we  included  the  PDZ  tandem  in  our  measurements  as 
a  control.  We  found  that  the  fluorescence  data  agree 
well  with  the  calorimetric  titrations  (Figure  5;  Table  3). 
The IL5Rα peptide binds to both PDZ domains with
similar Kd values in the mid-μM range, in agreement


Figure  5.  Binding  of  Dansyl-Labeled  Peptides  to  Syntenin 
Binding  of  dansyl-RVAFFEEL  to  PDZ1  (+),  dansyl-AFFEEL  to  PDZ1 
(•),  dansyl-LEDSVF  to  PDZ1  (■),  and  PDZ2  (□)  and  dansyl-TNEFYA 
to PDZ2 (A). Data were normalized as “fraction bound,” so that the
initial  fluorescence  was  zero  and  the  fluorescence  at  saturation  was 
equal  to  unity. 


with  the  2:1  stoichiometry  observed  by  ITC.  The  merlin- 
derived  peptide  shows  no  significant  affinity  toward 
PDZ2, but binds to PDZ1, to the tandem and the full-length
protein, with almost identical Kd values in the
sub-μM range. The merlin octapeptide binds to PDZ1
with  significantly  higher  affinity  than  the  hexapeptide,  in 
agreement  with  the  ITC  results.  The  syndecan-4  peptide 
interacts  exclusively  with  PDZ2,  with  an  affinity  virtually 
identical  to  those  observed  for  the  PDZ  tandem  and  full- 
length  syntenin.  This  result  is  again  consistent  with  the 
1:1  stoichiometry  determined  by  ITC.  However,  it  is  in 
conflict  with  the  previously  reported  2:1  stoichiometry 
for the whole C-terminal domain of syndecan-2 [12].
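
The "fraction bound" curves from the fluorimetric titrations can be related to Kd through the exact solution for 1:1 binding, which remains valid when the labeled peptide is not in vanishing excess. The sketch below is illustrative only: the fitting procedure actually used is not described in this section, the 1 nM peptide concentration is hypothetical, and the Kd of 268 nM is the value listed in Table 3 for the merlin octapeptide binding to PDZ1.

```python
import math


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of labeled peptide in complex for 1:1 binding.

    Solves [PL]^2 - (Pt + Lt + Kd)[PL] + Pt*Lt = 0 for the complex
    concentration and divides by the total peptide concentration.
    All concentrations must share the same units.
    """
    b = protein_total + ligand_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / ligand_total


KD_PDZ1_OCTA = 268e-9   # M, from Table 3
PEPTIDE = 1e-9          # M, hypothetical labeled-peptide concentration

for protein in (50e-9, 268e-9, 1e-6, 5e-6):
    print(protein, round(fraction_bound(protein, PEPTIDE, KD_PDZ1_OCTA), 3))
```

With the peptide present in trace amounts, half saturation falls at a protein concentration close to Kd, which is how the normalized curves in Figure 5 can be read by eye.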

It  has  been  suggested  previously  that  the  N-terminal 
fragment  of  syntenin  plays  a  regulatory  function.  For 
example, the association of PTP-η with syntenin was
shown  to  be  regulated  by  tyrosine  phosphorylation 
within  this  fragment,  with  phosphorylation  preventing 
the  association  [11].  This  indicates  that  the  N-terminal 
fragment  may,  at  least  under  some  conditions,  regulate 
the availability of at least one of syntenin’s peptide binding
pockets. Our data do not provide a clear answer in


Table 2. Isothermal Calorimetry of Syntenin Interactions

                 LEDSVF     ETLEDSVF   AFFEEL     RVAFFEEL   TNEFYA
                 (IL5Rα)    (IL5Rα)    (Merlin)   (Merlin)   (Syndecan)
PDZ Tandem
  Kd             43.8 μM    32.2 μM    11.6 μM    200 nM     2.9 μM
  n              2.26       2.07       0.94       1.12       1.11
  ΔH             -2.1 kJ    -7.6 kJ    -7.1 kJ    -3.1 kJ    -5.4 kJ
Full-Length
  Kd             19.5 μM    10.1 μM    8.9 μM     869 nM     2.5 μM
  n              1.09       1.12       1.3        1.16       1.14
  ΔH             -4.7 kJ    -8.87      -8.9 kJ    -7.3 kJ    -9.8 kJ

Dissociation constants, stoichiometries, and enthalpies for the interactions of the IL5Rα-, merlin-, and syndecan-derived peptides, with the
PDZ tandem of syntenin and full-length protein, determined by ITC. Representative data are shown for experiments that were conducted at
least twice for each interaction.
Table 3. Dissociation Constants of Syntenin Interactions by Fluorimetric Titrations

             DNS-EDSVF         DNS-FFEEL        DNS-VAFFEEL     DNS-NEFYA
             (IL5Rα)           (Merlin)         (Merlin)        (Syndecan)
PDZ-PDZ      18.5 μM (±3.2)    6.8 μM (±1.1)    495 nM (±55)    2.1 μM (±0.3)
PDZ1         10.6 μM (±0.8)    5.0 μM (±0.7)    268 nM (±28)    >1 mM
PDZ2         1.9 μM (±0.3)     >0.6 mM          >1 mM           2.3 μM (±0.5)

Dissociation constants for the interactions of the IL5Rα-, merlin-, and syndecan-derived dansylated peptides, with the PDZ tandem of syntenin
and isolated PDZ domains, determined by fluorimetric titration. Estimates of error are derived from experimental data.

this regard. The ITC data show that the IL5Rα peptide
binds to only one PDZ domain of full-length syntenin,
but to both in the tandem. Merlin, which binds only to
PDZ1, does so within the context of full-length syntenin
as well, suggesting that it is the PDZ2 that is occluded
by the N-terminal domain. However, the syndecan-4
peptide is selective for PDZ2 and also binds to full-length
syntenin. Further experiments will be necessary
to resolve this inconsistency.

Taken together, the structural and binding data indicate
that the two domains within the syntenin PDZ tandem
function independently. Each domain shows degenerate
specificity, so that PDZ1 binds peptides from
merlin and IL5Rα, whereas PDZ2 shows affinity toward
IL5Rα and syndecan-4. Although our data are internally
consistent and reproducible, they are in conflict with
some reports in the literature that claim individual PDZ
domains are incapable of binding peptides.

Similarities  to  Other  PDZ  Domains 
As  the  number  of  known  PDZ  domains  grows  and  their 
importance in a myriad of cellular events becomes evident,
numerous attempts have been made to elucidate
the  factors  that  govern  their  specificity.  High-resolution 
crystal  structures  and  solution  NMR  structures  have 
now  been  determined  for  a  number  of  PDZ  domains 
that  were  classified  into  distinct  groups.  It  is  clear  that 
the  overall  fold  of  the  domain  is  well  conserved,  and  the 
specificity  is  governed  by  subtle  structural  and  amino 
acid  sequence  variation.  The  application  of  generalized 
rules for governing PDZ domain specificity is complicated
by those PDZ domains that show degenerate
specificity  for  more  than  one  archetypal  class  of  peptide. 
Syntenin  is  one  of  the  examples  of  this  growing  group. 
Both  of  syntenin’s  PDZ  domains  fit  the  overall  fold  of 
PDZ  domains  well,  with  an  average  rms  deviation  from 
the known X-ray structures of 1.4 Å and 1.1 Å for PDZ1
and  PDZ2,  respectively. 

Syntenin,  with  its  tandem  PDZ  structure,  appears  to 
have  been  well  conserved  during  evolution.  The  rat  and 
mouse  syntenins  are  virtually  identical  to  the  human 
protein.  Recently,  the  jumbo  tiger  shrimp  Penaeus 
monodon  was  reported  to  contain  a  protein  similar  to 
syntenin with extremely high amino acid sequence identity,
when compared to the human protein, of 56% and
64%,  for  PDZ1  and  PDZ2  domains,  respectively  [28]. 
We  conducted  a  BLAST  search  of  the  genome  of  the 
malaria  vector  Anopheles  gambiae,  and  found  a  protein 
annotated  as  a  syntenin,  with  50%  amino  acid  identity 
to  the  human  molecule.  It  is  noteworthy  that  both  the 
Anopheles  and  Penaeus  homologs  are  far  more  similar 
to  the  human  protein  than  any  other  PDZ  domain  in  the 


human  genome  (Figure  6).  Ciona  intestinalist  a  primitive 
tunicate  with  the  smallest  known  genome  among 
Chordata ,  shows  the  presence  of  sequences  highly  simi¬ 
lar  (50%-60%  identity)  to  the  human  protein.  The  high 
sequence  similarity  among  such  diverse  species  sug¬ 
gests  that  the  molecule  predates  the  appearance  of 
vertebrates. 

Syntenin:  A  Link  between  Syndecan 
and  the  Actin  Cytoskeleton 

The biological function of syntenin and its domain structure
appear to have been stringently guarded by evolutionary
mechanisms. The present study strongly supports
earlier suggestions that merlin is a physiologically
relevant partner for syntenin. Because merlin is an actin
binding protein, syntenin may provide another link for
syndecan-regulated signaling to the cytoskeleton, with
syntenin mediating the colocalization of syndecan,
through PDZ2, and merlin, through PDZ1. This would be
comparable to the current model of syndecan signaling to
actin, through PDZ-containing CASK and protein 4.1
[29] or direct binding to another of the ERM proteins,
ezrin [30, 31]. The FERM domain of merlin binds ezrin
and could block the interaction of ezrin with actin [31].
This alternate anchoring signal pathway may give clues
regarding the involvement of syntenin in metastasis or
the tumor suppressor function of merlin. The identification
of the syntenin homolog in Anopheles prompted us
to look in the mosquito's genome for the homologs of
merlin and syndecan. We found an annotated merlin
homolog with 57% amino acid identity to the human
protein and a fully conserved C-terminal RVAFFEEL sequence.
Similarly, we found a syndecan-related protein
with a highly conserved C terminus containing
the TNEFYA motif.

Biological  Implications 

Syntenin is a ubiquitous protein involved in protein targeting
and multiprotein assembly, and it is overexpressed
in certain cancer cell lines. As inferred from
numerous yeast two-hybrid screens and other biochemical
assays, syntenin binds biologically important receptors
such as IL5Rα and syndecan, as well as the cytosolic
actin regulator merlin, which is a tumor suppressor
and a product of the causal gene of neurofibromatosis
type II. The crystal structure of the biologically functional
fragment of syntenin, residues 113-273, solved at 1.94 Å
resolution, reveals the presence of two canonical PDZ
domains, connected by a 4 residue linker. Both domains
appear to be free to interact with target peptides. It is
the first crystal structure containing more than one PDZ
domain from a single protein. Our binding studies, using
stringent biophysical techniques such as isothermal titration
calorimetry and fluorimetric titration, show that
the properties of the tandem are the sum of the binding
properties of the individual domains, with no detectable
cooperative effects. Each domain is able to bind peptides
belonging to two different classes: PDZ1 binds
peptides corresponding to merlin and IL5Rα (classes I
and III), whereas PDZ2 interacts with peptides derived
from IL5Rα and syndecan-4 (classes I and II). The separate
interactions of merlin with PDZ1 and of syndecan-4
with PDZ2 suggest the physiological coupling of
syndecan to merlin through syntenin. Because merlin
binds actin, this pathway could be vital for merlin's function
as a tumor suppressor. The recently completed genome
of the malaria vector Anopheles gambiae contains
homologs of both syntenin and merlin, indicating that
this pathway has been conserved during evolution.

Figure 6. Amino Acid Sequence Alignment of Human Syntenin with the Anopheles and Penaeus Homologs
The secondary structural elements shown correspond to the PDZ tandem presented in this work.

Experimental  Procedures 

Expression  and  Purification  of  Protein  Samples 
A syntenin clone was obtained from the American Type Culture
Collection (ATCC 72537). The DNA encoding full-length (residues
1-298), PDZ tandem (113-273), PDZ1 (113-193), and PDZ2 (197-273
and 197-298) domains of syntenin was amplified by PCR and
cloned into the parallel vector pGST-parallel1 [32], a GST-fusion
protein expression vector containing the recombinant TEV protease
(rTEV) cleavage site. The integrity of the insert was verified by direct
DNA sequencing. The expression of the proteins was induced by 1
mM IPTG in the E. coli BL21 strain (Stratagene). The SeMet-labeled
PDZ tandem was expressed in the D834 strain (Novagen) in M9
medium supplemented with SeMet. The expressed proteins were purified
by affinity chromatography using glutathione-Sepharose 4B
(Amersham Pharmacia Biotech). The recombinant protein was applied
to a HiPrep 26/10 desalting column (Amersham Pharmacia
Biotech) equilibrated with 50 mM Tris-HCl (pH 7.5), 150 mM NaCl,
and was digested using rTEV (Life Technologies) at 10°C in the
presence of 0.5 mM EDTA, 1 mM DTT. After complete digestion, the
GST tag was removed using a glutathione-Sepharose 4B column.
Gel filtration was performed with a Superdex G75 column (Amersham
Pharmacia Biotech) equilibrated with 20 mM Tris-HCl (pH 7.5), and
the fractions containing the PDZ tandem were collected and concentrated
using Centriprep YM10 for crystallization screening. The
purified proteins contain an additional five amino acids (GAMDP) at
the N terminus due to the cloning procedure.

Crystallization  and  Data  Collection 

Crystal Screen (Hampton Research) was used for preliminary
screening. Subsequently, crystallization conditions were optimized
around 0.1 M sodium acetate (pH 4.6) containing 24% PEG4000
and 0.2 M ammonium acetate. The sitting drop vapor diffusion
method was used for all crystallization trials. Drops were formed
with 3 µl of protein solution (6 mg/ml) and 3 µl of reservoir buffer,
and were overlaid with 15 µl of a 1:1 mixture of silicon and mineral
oil. Crystallization trays were stored at 21°C. The best crystals were
obtained after microseeding. The crystals used for data collection
were briefly soaked in the crystallization buffer containing 12.5%
(v/v) glycerol and frozen by immersion in liquid nitrogen. The structure
was solved using a three-wavelength MAD experiment with
SeMet-labeled crystals, with 180 images (1° rotation) collected for
each wavelength. Data were collected at 0.97946 Å (edge wavelength),
0.97133 Å (remote wavelength), and 0.97900 Å (peak wavelength).
The crystals are in space group C2, with unit cell parameters
a = 100.7 Å, b = 48.7 Å, c = 74.7 Å, and β = 120.8°. All data were
collected at beamline X9B at the NSLS, and processed and scaled
using HKL2000 [33]. Data collection statistics are presented in Table
1. The programs SnB [34] and SHELXS [35] were used to locate six
of the eight selenium atoms in the asymmetric unit. Phases were
generated in SHARP [36] and improved by density modification in
SOLOMON [37]. These phases were used as the starting point for
automatic model building in ARP/wARP [38]. This generated 275 of
the 332 residues in the asymmetric unit. After manually determining
which of the resulting polypeptide chains belonged to each monomer
in the asymmetric unit, ARP/wARP [38] was used to dock the
side chains. Manual model building was performed in O [39]. All but
one of the 322 syntenin residues in the asymmetric unit were included
in the model, as were 4 N-terminal residues that are an
artifact of subcloning. Solvent was added using ARP/wARP [38]. A
combination of CNS [40] and REFMAC5 [41] was used to refine this
initial model to an R factor of 18.4% and an Rfree of 22.7%. Maximum
likelihood residuals were used throughout the refinement process.
TLS refinement [42] and experimental phase information
[43] from SHARP were included in later stages of refinement to
minimize the difference between Rwork and Rfree. MOLPROBITY [26]
and OOPS2 [44] were used as validation tools during refinement
and rebuilding. Refinement data are presented in Table 1.


under the following conditions: λ excitation = 335 nm, λ emission =
540 nm, and excitation and emission slit width = 5 nm.

Stability  Measurements 

Solvent denaturations were performed on a J-715 spectropolarimeter
(Jasco) at 21°C with the automatic titrator (Jasco automatic
titration system) in 25 mM Tris-HCl, 50 mM NaCl (pH 7.4) and the
indicated concentration of guanidinium chloride (GdmCl). The transitions
were monitored by the decrease of the CD signal at 222 nm
with a 2 nm bandwidth. The apparent free-energy changes in the
absence of GdmCl (ΔG) were determined by fitting the ellipticity
changes at particular concentrations of GdmCl to the equation
given elsewhere [46]. Analysis of the data was performed with
the program Grafit 3.01 (Erithacus Software). GdmCl concentration
was determined by refractometry.
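
The fitting equation itself is only cited here [46], so the following is a minimal sketch of one commonly used treatment, a two-state transition with linear extrapolation of the unfolding free energy to zero denaturant. The model choice, the function names, and all numerical values are assumptions made for illustration and are not taken from this work.

```python
import math

R_GAS = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(denaturant_molar, dg_water, m_value, temp_kelvin=294.15):
    """Fraction of unfolded protein in an assumed two-state model.

    dg_water: unfolding free energy without denaturant, kJ/mol (hypothetical input)
    m_value:  linear dependence of the free energy on denaturant, kJ/(mol*M)
    """
    dg = dg_water - m_value * denaturant_molar  # linear extrapolation assumption
    k_unfold = math.exp(-dg / (R_GAS * temp_kelvin))
    return k_unfold / (1.0 + k_unfold)


def ellipticity(denaturant_molar, theta_native, theta_unfolded,
                dg_water, m_value, temp_kelvin=294.15):
    """Observed CD signal as a population-weighted mix of the two states."""
    fu = fraction_unfolded(denaturant_molar, dg_water, m_value, temp_kelvin)
    return theta_native + (theta_unfolded - theta_native) * fu


def midpoint_concentration(dg_water, m_value):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_water / m_value


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the curve.
    for conc in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(conc, round(fraction_unfolded(conc, 20.0, 10.0), 4))
```

In such a model the fitted free energy at zero denaturant plays the role of the apparent stability reported for each construct, and the default temperature of 294.15 K corresponds to the 21°C used in the measurements.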

Calorimetric Binding Assays

Prior to the experiment, the protein solution was extensively dialyzed
at 4°C against 25 mM phosphate or 25 mM Tris-HCl, 150 mM NaCl
(pH 8.0). The titration was performed using a 4200 isothermal titration
calorimeter (CSC). The protein concentration in the sample cell
was in the range of 0.1 to 0.3 mM with a cell volume of 1,250 µl.
The titrated peptides were dissolved to concentrations in the
range of 3 to 8 mM in dialysis buffer and injected in 5-10 µl aliquots.
All experiments were done at 25°C. The titration thermogram was
analyzed with BindWorks Applied Thermodynamics software to obtain
n, Kd, and ΔH values. Concentration of PDZ tandem and full-length
syntenin was estimated using the A280 molar absorbance coefficient
calculated from the number of Trp and Tyr residues [45]. The
concentration of PDZ2 and peptides was estimated using the A257
molar absorbance coefficient calculated from the number of Phe
residues in the molecules.
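
The concentration estimate from A280 can be sketched as follows. The per-residue coefficients (about 5,500 M⁻¹cm⁻¹ for Trp and 1,490 M⁻¹cm⁻¹ for Tyr) are commonly quoted literature values and are an assumption here; the exact values used in reference [45] are not stated in the text. The residue counts and the absorbance reading in the example are hypothetical.

```python
def molar_absorbance_280(n_trp, n_tyr, eps_trp=5500.0, eps_tyr=1490.0):
    """Approximate molar absorbance coefficient at 280 nm, in 1/(M*cm).

    The default per-residue values are assumed, commonly quoted numbers.
    """
    return n_trp * eps_trp + n_tyr * eps_tyr


def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical protein with 1 Trp and 2 Tyr, and a reading of A280 = 0.848.
    eps = molar_absorbance_280(n_trp=1, n_tyr=2)
    print(eps, concentration_molar(0.848, eps))
```

The same Beer-Lambert step applies to the A257 estimate for constructs and peptides that contain Phe but no Trp or Tyr, with a Phe-based coefficient in place of the one above.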


Fluorimetric  Titrations 

Binding of peptides to full-length syntenin, the PDZ tandem, and
isolated domains did not produce a detectable change in fluorescence.
Therefore, N-terminally dansylated peptides were used to
increase sensitivity. The concentration of dansylated peptide was
determined using the molar absorbance coefficient of the dansyl
group ε334 = 4,600 M⁻¹cm⁻¹. The binding was monitored by following
the increase in fluorescence upon titration of a concentrated protein
into a 1 cm × 1 cm stirred cell cuvette containing 1.2 ml of 25 mM
Tris-HCl, 150 mM NaCl (pH 7.5) and 0.5 µM dansylated peptide.
The protein stock concentration was in the range of 1-1.5 mM and
the signal was corrected for the dilution factor. Data were fitted
to the following equation [46] by nonlinear least squares analysis
using the program Grafit 3.01 (Erithacus Software):


$$ y = F_0 + \frac{(F_{max} - F_0)\,x}{K_d + x} \qquad (1) $$

where y is the fluorescence signal, x is the concentration of ligand,
Kd is the dissociation constant, F0 is the initial fluorescence value,
and Fmax is the fluorescence value at saturation. Experiments were
done in duplicate at 21°C using an FP-750 spectrofluorimeter (Jasco)
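
A fit of this kind can be sketched in a few lines. The sketch assumes a single-site hyperbolic model, y = F0 + (Fmax - F0)x/(Kd + x), built from the four quantities defined above; it is not the Grafit procedure used in this work. For a fixed Kd the model is linear in F0 and Fmax, so the sketch scans Kd on a logarithmic grid and solves the two linear parameters in closed form. Function names, the grid bounds, and the synthetic data are all hypothetical.

```python
def binding_signal(x, f0, fmax, kd):
    """Assumed single-site hyperbolic model for the fluorescence signal."""
    return f0 + (fmax - f0) * x / (kd + x)


def _linear_fit(xs, ys, kd):
    """For a fixed kd, solve f0 and fmax by linear least squares."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x, y in zip(xs, ys):
        s = x / (kd + x)   # fractional saturation
        u = 1.0 - s        # weight of f0 in y = f0*u + fmax*s
        a11 += u * u
        a12 += u * s
        a22 += s * s
        b1 += u * y
        b2 += s * y
    det = a11 * a22 - a12 * a12
    f0 = (b1 * a22 - a12 * b2) / det
    fmax = (a11 * b2 - a12 * b1) / det
    sse = sum((y - binding_signal(x, f0, fmax, kd)) ** 2 for x, y in zip(xs, ys))
    return f0, fmax, sse


def fit_kd(xs, ys, kd_low, kd_high, steps=2000):
    """Scan kd on a log grid and return the best (f0, fmax, kd)."""
    best = None
    for i in range(steps + 1):
        kd = kd_low * (kd_high / kd_low) ** (i / steps)
        f0, fmax, sse = _linear_fit(xs, ys, kd)
        if best is None or sse < best[3]:
            best = (f0, fmax, kd, sse)
    return best[:3]


if __name__ == "__main__":
    # Synthetic, noise-free titration with hypothetical parameters.
    conc = [0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
    signal = [binding_signal(x, 10.0, 110.0, 5.0) for x in conc]
    print(fit_kd(conc, signal, 0.1, 100.0))
```

The grid scan trades some precision in Kd for robustness: it needs no starting guess and cannot diverge, which is convenient for duplicate titrations with few points.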

Summary

Crystal structures of the PDZ2 domain of the scaffolding
protein syntenin, both unbound and in complexes
with peptides derived from C termini of IL5 receptor
(α chain) and syndecan, reveal the molecular roots of
syntenin's degenerate specificity. Three distinct binding
sites (S0, S-1, and S-2), with affinities for hydrophobic
side chains, function in a combinatorial way: S-1
and S-2 act together to bind syndecan, while S0 and
S-1 are involved in the binding of IL5Rα. Neither mode
of interaction is consistent with the prior classification
scheme, which defined the IL5Rα interaction as class
I (-S/T-X-Φ) and the syndecan interaction as class II
(-Φ-X-Φ). These results, in conjunction with other
emerging structural data on PDZ domains, call for a
revision of their classification and of the existing model
of their mechanism.

Introduction 

PDZ domains (postsynaptic density protein, disc large,
and zonula occludens) occur within numerous multidomain
cytosolic proteins and mediate their binding to
receptors and channels, thereby serving as a membrane-associated
scaffold for the assembly of signaling
complexes (Harris and Lim, 2001; Hung and Sheng,
2002). Over 440 domains of this type have been identified
so far in the human genome, and they are also
abundant in other organisms (Sheng and Sala, 2001).
PDZ domains are structurally conserved modules about
90 amino acids in size. The majority are believed to
function by binding the C-terminal tail of the target protein
in a structurally conserved groove between the β2
strand and the α2 helix (Doyle et al., 1996). The terminal
carboxylate of the target is anchored via hydrogen
bonds from three main chain amides within a conserved
glycine-rich loop, a fingerprint of the PDZ fold. Early
data derived from crystallographic and NMR studies
suggested a general model of sequence pattern recognition,
in which the peptide is bound in an extended
conformation so that two side chains, P0 and P-2, point
into the groove of the PDZ domain and account for
specificity (P0 denotes the C-terminal residue of the
bound peptide and P-n denotes the nth amino acid upstream
of it; S-n denotes the corresponding binding
pocket of the PDZ domain). Those domains that are
grouped together as class I bind Ser or Thr in P-2 and
a hydrophobic residue in P0, so that the target sequence
motif is -S/T-X-Φ (Φ represents hydrophobic residues
and Ψ represents aromatic residues). Class II domains
bind another hydrophobic residue at P-2 (-Φ-X-Φ), while
a negatively charged residue at P-2 defines class III
interactions (-D/E-X-Φ). This simple model is unable to
explain an increasing number of PDZ-mediated interactions
that do not conform to this canonical type of recognition.
To account for them, new classes of PDZ domains
are being proposed to extend the model. For example,
PDZ1 of Mint1 has been termed "novel class III" (-E/D-X-W-C/S)
(Maximov et al., 1999), and PDZ3 of hINADL
has been placed in "class IV" (-X-Ψ-D/E) (Vaccaro and
Dente, 2002). These new classes of PDZ domains recognize
P-1 instead of P-2. To further complicate the issue,
some PDZ domains recognize more than one class of
C-terminal sequence motif. CIPP PDZ3 binds neurexin
(class II) and the NMDA receptor (class I) (Kurschner
et al., 1998), and the third PDZ domain of hINADL binds
the sequences -V-D-Φ (class II) and -X-Ψ-D
(class IV) (Vaccaro and Dente, 2002), while MINT1 PDZ1,
hINADL PDZ5, and Par6 PDZ domains bind ligands with
sequences -D-H-W-C (novel class III) and -E-Y-Y-V
(class II) (Bezprozvanny and Maximov, 2001). The erbin
PDZ domain binds the receptor ErbB2 (class II) and
LET-23 peptide (class I) (Borg et al., 2000). While dual
specificity is not rare in PDZ binding, there is no general
model accounting for it.
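
The canonical three-class scheme described above reduces to a lookup on the residues at P0 and P-2, which the following minimal sketch makes explicit. The set of residues treated as hydrophobic is an assumption made for illustration, and the function name is hypothetical; the sketch deliberately ignores the newer classes that read P-1.

```python
# Residues treated as hydrophobic here; this choice is an assumption.
HYDROPHOBIC = set("AVLIMFWYC")


def classify_pdz_ligand(peptide):
    """Assign a C-terminal peptide to canonical class I, II, or III.

    Only P0 (last residue) and P-2 (third from the end) are inspected,
    following the classical model; anything else is 'unclassified'.
    """
    seq = peptide.upper().replace("-", "")
    if len(seq) < 3:
        raise ValueError("need at least three C-terminal residues")
    p0, p_minus2 = seq[-1], seq[-3]
    if p0 not in HYDROPHOBIC:
        return "unclassified"
    if p_minus2 in "ST":
        return "I"      # -S/T-X-hydrophobic
    if p_minus2 in HYDROPHOBIC:
        return "II"     # -hydrophobic-X-hydrophobic
    if p_minus2 in "DE":
        return "III"    # -D/E-X-hydrophobic
    return "unclassified"


if __name__ == "__main__":
    for name, tail in (("IL5R alpha", "ETLEDSVF"),
                       ("syndecan-4", "TNEFYA"),
                       ("merlin", "RVAFFEEL")):
        print(name, classify_pdz_ligand(tail))
```

A sequence-only rule of this kind reproduces the class labels used for syntenin's partners, but it says nothing about which pockets are actually occupied in a given complex.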

Syntenin, first identified as a syndecan binding protein,
contains a tandem of PDZ domains, which demonstrate
degenerate specificity (Grootjans et al., 1997). In
addition to syndecan, there are currently at least 10
binding partners reported for syntenin, including class
I proteins such as interleukin 5 receptor α subunit (IL5Rα)
(-D-S-V-F) (Geijsen et al., 2001), neuroglian (-Y-S-L-A)
(Koroll et al., 2001), proTGF-α (-E-T-V-V) (Fernandez-Larrea
et al., 1999), and neurofascin (-Y-S-L-A) (Koroll
et al., 2001); class II molecules such as syndecan
(-E-F-Y-A), ephrin B (-Y-Y-K-V) (Lin et al., 1999; Torres
et al., 1998), Eph A7 (-G-I-Q-V) (Torres et al., 1998), PTP-η
(-G-Y-I-A) (Iuliano et al., 2001), and neurexin I (-E-Y-Y-V)
(Grootjans et al., 2000); and the class III protein merlin
(-F-E-E-L) (Jannatipour et al., 2001). In principle, such
diversity of interactions could be caused by degenerate
specificity or alternatively by cooperative effects of two
PDZ domains. We recently showed that syntenin's two
PDZ domains show degenerate and noncooperative
binding (Kang et al., 2003). The second PDZ domain
(PDZ2) binds IL5Rα (class I) and syndecan-4 (class II)
peptides, in spite of dramatically dissimilar sequences,
with dissociation constants of 1.9 µM and 2.3 µM,
respectively. Mutational studies also show that PDZ2
has binding capacity for both class I and class II peptides
(Grootjans et al., 2000; Koroll et al., 2001). In order
to elucidate the molecular basis for the dual specificity
of the PDZ2 domain of syntenin, we determined the
crystal structures of the PDZ2 domain alone and in complexes
with an IL5Rα C-terminal peptide (ETLEDSVF)
and a syndecan-4 peptide (TNEFYA). The structures
were refined to 1.24 Å, 1.35 Å, and 1.85 Å resolution,
respectively. These structures show how syntenin's
PDZ2 can accommodate different peptides and call for
a revision of the established paradigm of PDZ domain
classification.

Table 1. Crystallographic Data

Data Set                       PDZ2           PDZ2           PDZ2-syndecan-4   PDZ2-IL5Rα

Experimental Data
Wavelength (Å)                 1.54178        0.97946        0.97946           0.97946
Space group                    P2₁            P2₁            C2                C222₁
Unit cell parameters (Å, °)
  a                            25.27          25.29          58.34             53.72
  b                            42.54          42.57          54.44             55.98
  c                            31.06          31.04          50.22             51.09
  β                            108.8          108.7          98.7              90.0
Resolution (Å)                 25.0-1.60      13.0-1.10      50.0-1.85         20.0-1.25
                               (1.66-1.60)a   (1.14-1.10)    (1.92-1.85)       (1.29-1.25)
Total reflections              21,360         58,393         46,421            155,256
Unique reflections             6,600 (186)    18,559 (275)   12,987 (1,108)    20,850 (1,611)
Completeness (%)               79.8 (22.9)    73.3 (11.0)b   97.4 (83.7)       95.7 (75.0)
Rsym (%)c                      5.8 (17.0)     4.2 (17.0)     4.9 (41.3)        5.1 (49.6)
Average I/σ(I)                 22.7 (3.62)    28.5 (3.46)    25.8 (2.86)       33.5 (2.15)

Refinement Details
Resolution (Å)                 21.27-1.60     12.16-1.24     49.39-1.85        19.39-1.35
Reflections (working)          6,290          15,685         12,001            16,171
Reflections (test)             307            851            985               848
Rwork (%)d                     11.9           11.3           17.5              17.6
Rfree (%)d                     16.6           15.3           22.6              21.2
Number of waters               181            173            145               144
Rms deviation from ideal geometry
  Bonds (Å)                    0.015          0.013          0.012             0.016
  Angles (°)                   1.48           1.63           1.82              2.13
Average B factor (Å²)
  Main chain                   11.42          8.32           16.01             17.19
  Side chain                   12.69          10.15          20.50             21.50
  Waters                       26.41          22.21          39.86             39.93

a The numbers in parentheses describe the relevant value for the last resolution shell.
b Completeness at resolution 12.16-1.24 (1.30-1.24) used for refinement is 93.1% (73.4%).
c Rsym = Σ|Ii - <I>|/ΣI, where Ii is the intensity of the ith observation and <I> is the mean intensity of the reflections.
d Rwork = Σ||Fobs| - |Fcalc||/Σ|Fobs|, crystallographic R factor, and Rfree = Σ||Fobs| - |Fcalc||/Σ|Fobs| when all reflections belong to a test set of randomly selected data.
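
The agreement statistics quoted for these structures follow directly from their definitions, as the minimal sketch below shows. The function names and the toy numbers are hypothetical; real values come from the data processing and refinement programs, not from code like this. Rfree is the same sum as Rwork, evaluated only over the reflections set aside as the test set.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R: sum of | |Fobs| - |Fcalc| | over sum of |Fobs|.

    Pass working-set amplitudes for Rwork or test-set amplitudes for Rfree.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_sym(observations):
    """Rsym: deviation of repeated intensity measurements from their mean.

    observations maps a reflection index to the list of measured intensities
    of that reflection and its symmetry equivalents.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


if __name__ == "__main__":
    # Toy amplitudes and intensities, for illustration only.
    print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
    print(r_sym({(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}))
```

Because the test reflections are never used in refinement, a gap between the two R values that grows during refinement is a common sign of overfitting.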

Results  and  Discussion 

Syndecan Binding Involves Interaction with Tyr-1
The crystal structure of the PDZ2 domain with a bound
syndecan-4 C-terminal hexapeptide was refined at
1.85 Å resolution to a crystallographic R value of 17.5%
and Rfree of 22.6% (Table 1). The structure contains a
noncrystallographic dimer of PDZ2-peptide complexes
in the asymmetric unit. In both complexes, the structures
of the bound syndecan-4 peptides were identical within
experimental error, with average isotropic temperature
factors (B factors) of 30.9 Å² and 31.4 Å², respectively.
In general, the interaction between PDZ2 and the
peptide conforms to the classical model of a strand
insertion between the β2 strand and α2 helix of the
PDZ domain (Figure 1A). The terminal carboxylate of the
peptide accepts three hydrogen bonds from the amide
nitrogens of Val209, Gly210, and Phe211. There is an
additional indirect interaction with the carbonyl oxygen
of Gly207 through an ordered water molecule. The main
chain amide of the C-terminal residue donates a hydrogen
bond to the carbonyl oxygen of Phe211 in the β2
strand. The carbonyl oxygen of Phe (P-2) interacts with
the amide group of Phe213, while its amide donates a
hydrogen bond to the carbonyl oxygen of the same
residue. The C-terminal Ala (P0) and Phe (P-2) of the
syndecan peptide interact with PDZ2 in agreement with
the canonical model of class II, as exemplified by the
structure of hCASK (Daniels et al., 1998). However, the
methyl group of Ala (P0) is much smaller than the
hydrophobic pocket at S0, which is formed mostly
by Val209, Phe211, Phe213, and Leu258 (Figure 1B). In
contrast, the benzene ring of Phe (P-2) fits well in the
corresponding S-2 pocket formed by Phe213, Asp251,
and Ala255.

Interestingly, there is an additional interaction involving
Tyr (P-1), which is lodged into the S-1 pocket cushioned
by His208, Ile212, and Val222. The aromatic ring
is involved in an off-center stacking interaction with
His208. Other syntenin PDZ2 binding proteins, neurexin
I (class II) and neuroglian (class I), also have Tyr in the
P-1 position (Grootjans et al., 2000; Koroll et al., 2001).
The equal importance of the Phe (P-2) and Tyr (P-1)
interactions for recognition by syntenin's PDZ2 is underscored
by studies showing that mutations to Ala of either
of the two residues abolish binding to syntenin (Grootjans
et al., 1997). Thus, the interaction of PDZ2 with
syndecan depends primarily on the side chains of residues
in P-1 and P-2 rather than P0 and P-2, as the classical
model implies.

Figure 1. Comparison of Syntenin PDZ2 Structures Binding Syndecan-4 Peptide and Interleukin 5 Receptor α Subunit

(A) Ribbon diagram of the syntenin PDZ2 bound to the syndecan-4 peptide (TNEFYA). A 2mFo - DFc electron density map calculated at 1.85 Å
resolution and contoured at 1.0σ is shown around the ligand.

(B) Molecular surfaces of syntenin PDZ2 showing three hydrophobic binding pockets and the syndecan-4 peptide. The three binding pockets
are circled. The three C-terminal residues are shown in the Cα trace cartoon of the peptide. The side chains of tyrosine (-1) and phenylalanine
(-2) occupy the two pockets S-1 and S-2, while alanine (0) only occupies a portion of S0.

(C) Ribbon diagram of the syntenin PDZ2 bound to the interleukin 5 receptor α subunit peptide (ETLEDSVF). A 2mFo - DFc electron density
map calculated at 1.35 Å resolution and contoured at 1.0σ is shown around the ligand.

(D) Molecular surfaces of syntenin PDZ2 showing three hydrophobic binding pockets and the interleukin 5 receptor α subunit peptide. The
three binding pockets are circled. The three C-terminal residues are shown in the Cα trace cartoon of the peptide. The side chains of
phenylalanine (0) and valine (-1) of the peptide are located in pockets S0 and S-1, while that of serine (-2) is out of pocket S-2. The same
orientation is used for (A) and (B) or (C) and (D). Figures were made using MOLSCRIPT (Kraulis, 1991) and Pymol (DeLano Scientific). Strand
β1 (197-203); β2 (210-214); β3 (216-222); β4 (234-242); β5 (244-246); β6 (263-270); Helix α1 (225-231); α2 (250-260).

The Interaction of IL5Rα with Syntenin
Does Not Involve Ser-2

The interaction of syntenin with IL5Rα was originally
reported as class I, because the C-terminal sequence
of the receptor has Ser in the P-2 position and Phe in
the P0 position. However, the sequences of syntenin
PDZ domains do not resemble a typical class I domain.
In particular, there is a notable absence of a histidine
at the beginning of helix α2, which normally hydrogen
bonds to the hydroxyl of Ser or Thr (P-2). In an effort to
characterize the details of the IL5Rα interactions with
syntenin, we crystallized and solved the structure of
the PDZ2 domain with an octapeptide derived from the
C-terminal sequence of IL5Rα. The crystals of the complex
allowed for X-ray data collection to a resolution of
1.35 Å (Table 1). The atomic model, refined to a crystallographic
R value of 17.6% (Rfree 21.2%), shows how the
C-terminal carboxylate group of Phe (P0) of the peptide
is bound in a way analogous to that seen in the syndecan
complex, while its benzene ring fills the S0 pocket (Figures
1C and 1D). However, Ser (P-2) does not directly
interact with the PDZ2 domain as suggested by the
classical model. The side chain of Val (P-1) fits into the
hydrophobic S-1 pocket, and the carbonyl oxygen of
Ser (P-2) is hydrogen bonded through a water molecule
to the Ile212 main chain nitrogen. There are no further
interactions, and the electron density for the peptide
fades beyond P-4. Except for the interaction of Phe
(P0), the peptide's backbone does not fully occupy the
binding groove as is seen in other complexes, leaving
the S-2 site empty. Similar interaction of these three
C-terminal residues was also found in the crystal structure
of the syntenin PDZ tandem-IL5Rα peptide complex
(data not shown, our PDB entry 1OBZ). It has been
shown by mutational studies that with the exception
of the C-terminal Phe, no other residue is vital for the
interaction of IL5Rα with syntenin (Geijsen et al., 2001).
This is in agreement with our results but does not support
the classical class I recognition mechanism. In contrast,
the structure reveals some similarities to the conformation
observed in the erbin-ErbB2 peptide complex
(Birrane et al., 2003). Failure of Val (P-2) to form a typical
class I interaction with His at the α2 helix causes the
displacement of the peptide backbone away from the α
helix. In the complex of erbin and phosphorylated ErbB2
peptide (EpYLGLDVPV), no density is observed
beyond P-5, leaving its other binding pocket at the β2-β3
loop empty.

Figure 2. Syntenin PDZ2 Interaction with Its Neighbor Molecule
Shows Novel PDZ Binding

(A) Crystal packing of syntenin PDZ2. Each C terminus serves as a
ligand for a neighboring PDZ2 molecule.

(B) Ribbon diagram of syntenin PDZ2 bound to the C-terminal and
internal sequences of its neighbor molecule. A 2mFo - DFc electron
density map calculated at 1.24 Å resolution and contoured at 1.0σ
is shown around the residues bound in the PDZ domain.

(C) Molecular surfaces of syntenin PDZ2 showing three hydrophobic
binding pockets and the residues of the neighboring molecule binding
at the pockets. The three binding pockets are circled. The three
residues are shown in the Cα trace cartoon of bound residues. The
side chains of phenylalanine (0) and alanine (-1) of the C terminus
reside in pockets S0 and S-1, while that of methionine (3) is in pocket
S-2. The same orientation is used for (B) and (C). Figures were made
using MOLSCRIPT (Kraulis, 1991) and Pymol (DeLano Scientific).

The Structure of Unbound PDZ2 Suggests
Additional Recognition Mechanisms
PDZ domains show high affinity toward terminal carboxyl
groups of peptides, and in the absence of target
peptides, isolated PDZ domains will often bind their own
C-terminal tails. The crystal structures of the hCASK
PDZ, NHERF PDZ1, and GRIP1 PDZ6 domains show
how the peptide binding grooves are occupied by the
C-terminal tails of neighboring molecules, mimicking the
recognition of the peptide ligand (Daniels et al., 1998;
Im et al., 2003; Karthikeyan et al., 2001a). In a similar
way, the crystal structure of the uncomplexed PDZ2
domain, refined at 1.24 Å resolution to a crystallographic
R value of 11.3% (Rfree 15.3%), shows an interesting
interaction between adjacent molecules (Table 1 and
Figure 2A). This interaction suggests that PDZ2 is capable
of yet another mode of molecular recognition, in
addition to the classic mode of strand insertion.

The PDZ2 construct used here terminates at residue
Phe273, rather than Met270, as is the case with the
crystal structures of the two above described complexes.
(Chronologically, this structure was done first,
and  the  construct  was  then  truncated  to  circumvent 
inter-PDZ2  interactions  and  inhibition  of  peptide  bind¬ 
ing.)  One  molecule  of  PDZ2  binds  the  C-terminal  Phe 
of  its  crystallographic  neighbor  in  a  manner  identical  to 
that  observed  for  the  IL5Ra  (Figures  2B  and  2C).  The 
preceding  residues  are  out  of  the  binding  groove,  and 
there  is  no  interaction  of  P_2  at  S_2.  Compared  to  either 
the  syndecan  or  IL5Ra  bound  structures,  the  binding 
groove  is  not  altered  except  for  the  side  chain  of  Ile21 2, 
which  rotates  to  accommodate  the  C  terminus  of  the 
neighboring  molecule.  In  addition  to  the  interaction  in¬ 
volving  the  C-terminal  Phe,  the  N-terminal  portion  of  the 
same  molecule  also  interacts  with  the  binding  groove. 
This  is  possible  because  the  C  terminus  and  N  terminus 
of  PDZ2  are  close  to  each  other  and  form  a  structural 
epitope.  The  side  chain  of  the  third  residue  Met  binds 
into  the  S-2  site,  so  that  its  methyl  group  is  in  a  very 
close  contact  with  the  aromatic  ring  of  Phe213.  A  salt 
bridge  forms  between  carboxyl  group  of  an  Asp,  which 
follows  the  Met,  and  amino  group  of  Lys21 4  in  {32  strand 
of  the  adjacent  molecule.  Thus,  in  this  case,  syntenin’s 
PDZ2  domain  shows  affinity  for  a  structural  epitope, 
rather  than  a  sequence  motif.  It  is  very  probable  that 
this  mode  of  recognition  also  occurs  in  nature,  and  that 
the  two  binding  sites  (So  and  S_j >)  may  bind  amino  acids, 
which  need  not  be  adjacent  within  a  short  peptide. 


Molecular  Basis  of  Recognition  by  PDZ  Domains 
Our results, in conjunction with existing literature, call into
question the utility of the current model of protein recognition by
PDZ domains and of the rigid classification of PDZ domains based on
the identity of the P0 and P-2 residues of a target peptide (Figure
3A). As shown here, syntenin PDZ2 has three distinct binding pockets
(S0, S-1, and S-2), and the interaction of the P-1 residue at the S-1
site is as important as the canonical interactions at the S0 and S-2
sites. Therefore, the P-1 interaction should be included as a general
feature of PDZ interaction (Figure 3B). The importance of P-1 binding
is also apparent from studies of other PDZ domains. Both LIN-2 and p55
PDZ domains bind peptides where all three terminal residues are
hydrophobic, including aromatic side chains at both P-1 and P-2
(Songyang et al., 1997). By phage-displayed peptide library screening,
PDZ2 of MAGI3 selects Trp at P-1, and the side chain of this Trp is
critical for high-affinity binding (Fuh et al., 2000). Model studies
suggest that the side chain at the P-1 position reaches across the β2
strand and makes specific contacts with side chains in the β3 strand.
The binding specificity studies of hINADL reveal that PDZ1, 2, 3, and
4 belong to class II while PDZ5, 6, and 7 are class I PDZ domains
(Vaccaro et al., 2001). However, except for PDZ7, all domains in
hINADL have some selectivity for P-1, and site-directed mutagenesis of
the residues in β3 of hINADL PDZ7 alters the selectivity for P-1. The
atomic model of the NHERF PDZ domain complexed with the -Q-D-T-R-L
target sequence also reveals recognition mediated by the P-1 residue:
the Arg side chain in this position interacts intimately with Asn22
and Glu43 of the PDZ (Karthikeyan et al., 2001b); these residues,
lining the S-1 pocket, are equivalent to His208 and Val222 of syntenin
PDZ2.

Another example is the erbin PDZ domain, normally defined as class I
because of the His at the beginning of the α2 helix and because of the
target motif -S/T-X-Φ. Interestingly, this domain does not bind ErbB4
(-N-T-V-V) (Borg et al., 2000), but interacts with the C termini of
δ-catenin, p0071, and ARVCF, which all share the sequence -D-S-W-V
(Laura et al., 2002). Furthermore, Trp was exclusively selected at P-1
for the erbin PDZ domain by phage display peptide library screening.
The NMR solution structure of the erbin PDZ domain with the
phage-optimized peptide (AcTGWETWV) reveals canonical interactions of
the P0 and P-2 residues, as well as an additional interaction
involving Trp (P-1), whose side chain reaches across strand β2 and
inserts between the side chains of Arg49 and Gln51 at the end of the
β3 strand (Skelton et al., 2003). Clearly, a more accurate description
of the target motif for erbin would be -S/T-W-Φ, highlighting the
preference at P-1 for Trp.
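The difference between the generic class I motif and the refined erbin motif can be made concrete with a short pattern-matching sketch. It is only an illustration: the set of residues counted as hydrophobic (Φ) and the names `CLASS_I`, `ERBIN_REFINED`, and `matches` are assumptions introduced here, not definitions taken from the cited studies.

```python
import re

# Assumption: one-letter codes treated as hydrophobic (Φ) for illustration only.
HYDROPHOBIC = "AVILMFWYC"

# Generic class I motif -S/T-X-Φ, anchored at the C terminus.
CLASS_I = re.compile(rf"[ST].[{HYDROPHOBIC}]$")

# Refined erbin motif -S/T-W-Φ, which requires Trp at P-1.
ERBIN_REFINED = re.compile(rf"[ST]W[{HYDROPHOBIC}]$")


def matches(pattern, c_terminal_sequence):
    """Return True if the C-terminal residues satisfy the motif."""
    return pattern.search(c_terminal_sequence) is not None


if __name__ == "__main__":
    for tail in ("NTVV", "DSWV", "TGWETWV"):
        print(tail, matches(CLASS_I, tail), matches(ERBIN_REFINED, tail))
```

Under these assumptions the ErbB4 tail (-N-T-V-V) satisfies the generic class I pattern but not the refined one, in line with the observation that erbin does not bind it, whereas -D-S-W-V and the phage-optimized peptide satisfy both.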

In syntenin’s PDZ2, all three binding pockets (S0, S-1, and S-2) have
an apparent affinity for hydrophobic residues (-Φ-Φ-Φ). These pockets
function in a combinatorial way to bind peptides from different
targets. This interplay of the three sites appears to endow PDZ2 with
the ability to bind diverse but specific sequence motifs. The complex
of syntenin’s PDZ2 domain with the syndecan-4 peptide, previously
classified as a class II interaction, involves the recognition of two
penultimate residues, Tyr (P-1) and Phe (P-2), but not of the side
chain of P0 (-Φ-Φ-X) (Figures 1B and 3C). On the other hand, the
interaction with the IL5Rα peptide does not involve the P-2 side chain
hydroxyl, unlike other PDZ domains that interact with class I peptides
(Figure 1D). Therefore, defining the IL5Rα peptide as a class I
partner for syntenin is questionable. Our data suggest that syntenin’s
PDZ2 interactions with other so-called class I peptides may also
involve binding of P0 and P-1 (-X-Φ-Φ) instead of P0 and P-2 as the
classical model requires (Figure 3D). All of syntenin’s partners known
to bind to PDZ2 have hydrophobic residues in P-1.

Our combinatorial model incorporates the classical P0 and P-2
interactions (-Φ-X-Φ) but accounts for all three S sites (-Φ-Φ-Φ)
(Figures 3A and 3B). We expect that the former mode explains
syntenin’s interaction with ephrin B (-Y-Y-K-V), while the latter may
apply to neurexin (-E-Y-Y-V). This model could also explain the dual
specificity observed for other PDZ domains. For example, the CIPP PDZ3
domain binds NMDA receptors (class I) and neurexin (class II).
Although the former target has a sequence -E-S-D/E-V, the CIPP PDZ3
domain does not have a His at α2 and it does not bind a related
peptide from neuroligin2 (-T-T-R-V) (Kurschner et al., 1998),
indicating that the P-2 position is not critical, in contrast to the
canonical class I interaction. The SMART database places this
particular PDZ domain in a group that binds a motif -Ψ-D/E-Φ
(Bezprozvanny and Maximov, 2001). Applying the combinatorial
three-pocket model helps to understand how it can bind NMDA receptors
using P-1 and P0 (-S-D/E-V) and neurexin using P-2 and P0 (-Y-Y-V).
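The combinatorial idea can be sketched as a small lookup: label the last three residues of a peptide as P0, P-1, and P-2, decide which of them could fill a hydrophobic pocket, and map that combination onto the binding modes A to D used in Table 2 and Figure 3. This is a toy illustration, not the authors' procedure. The residue set `POCKET_FILLING` is an assumption chosen for the example (Ala is left out because the text reports that the P0 side chain of the syndecan-4 peptide is not recognized), and the function names are hypothetical.

```python
# Assumption: residues treated as able to occupy a hydrophobic pocket.
POCKET_FILLING = set("FYWLIVM")

# Binding modes as labeled in Table 2: A (P0, P-2), B (all three),
# C (P-1, P-2), D (P0, P-1).
MODES = {
    ("P0", "P-2"): "A",
    ("P0", "P-1", "P-2"): "B",
    ("P-1", "P-2"): "C",
    ("P0", "P-1"): "D",
}


def c_terminal_positions(peptide):
    """Return the residues at P0, P-1 and P-2 (P0 is the last residue)."""
    if len(peptide) < 3:
        raise ValueError("need at least three residues")
    return {"P0": peptide[-1], "P-1": peptide[-2], "P-2": peptide[-3]}


def binding_mode(peptide):
    """Return the mode letter, or None if fewer than two pockets are filled."""
    positions = c_terminal_positions(peptide)
    used = tuple(
        name for name in ("P0", "P-1", "P-2")
        if positions[name] in POCKET_FILLING
    )
    return MODES.get(used)


if __name__ == "__main__":
    for name, tail in [("syndecan-4", "TNEFYA"), ("IL5Ra", "ETLEDSVF"),
                       ("ephrin B", "YYKV"), ("neurexin", "EYYV")]:
        print(name, binding_mode(tail))
```

With this residue set the sketch assigns mode C to the syndecan-4 peptide, D to the IL5Rα peptide, A to ephrin B, and B to neurexin, which matches the assignments discussed in the text. A different choice of residue set would change the outcome, so the result should be read as a consistency check rather than a prediction.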

Aside from explicit examples of the involvement of the P-1 residue,
which is unaccounted for in the classical model, there are examples of
the P-2 residue not being involved, as in the interaction of
syntenin’s PDZ2 with IL5Rα. The so-called class IV or class III
interactions, targeting sequences -X-Ψ-D/E or -X-W-C, do not show
specificity for P-2 (Bezprozvanny and Maximov, 2001; Maximov et al.,
1999). It is not surprising that degenerate specificity is common
among these PDZ domains. Mint1 PDZ1, which has dual specificity for
sequences -E/D-X-W-C/S and -E-Y-Y-V, could bind both types of
sequences by interacting with the P-1/P0 and P-2/P-1 pairs,
respectively. The ability of the fifth PDZ domain of hINADL (Vaccaro
et al., 2001) and the PAR6 PDZ domain to bind the sequences -D-H-W-C
and -E-Y-Y-V may be rationalized in a manner similar to the Mint1 PDZ1
interaction. However, there may be an additional interaction involving
the conserved P-3 residue, D/E.

Figure 3. Schematics of PDZ Interactions

(A) Canonical PDZ binding of the C-terminal sequence depends on the
residues P0 and P-2 binding at pockets S0 and S-2.

(B) All three C-terminal residues are involved in the interaction
with the PDZ binding groove.

(C) The C-terminal binding depends on the binding at S-1 and S-2 of
the PDZ, as seen in the PDZ2-syndecan-4 peptide complex.

(D) The C-terminal binding depends on the binding at S0 and S-1, as
seen in the syntenin PDZ2-interleukin 5 receptor α subunit peptide
complex.

(E) Syntrophin PDZ interaction with residues from the β-finger
conformation of nNOS.

(F) Interaction of an internal residue at pocket S-2 while the
C-terminal residue binds at S0, as seen in the syntenin PDZ2-PDZ2
interaction.

We believe that all three S sites described here are key determinants
in the PDZ complex recognition pattern, and any general model should
include all three. The binding modes of known PDZ domains, according
to our combinatorial model depicted in Figure 3, are summarized in
Table 2.

There is also accumulating evidence that residues upstream of the
C-terminal tripeptide of the target may also be involved in the
recognition process, putting the classical model in even greater peril
(Laura et al., 2002; Songyang et al., 1997; Vaccaro et al., 2001).
Although we found no interaction with P-3 in the syntenin PDZ2
structures, specificity for P-3 is often observed. Some PDZ domains,
which have a long β2 strand or β2-β3 loop, interact further with their
target peptides in this region (Birrane et al., 2003; Kozlov et al.,
2002; Walma et al., 2002). We have shown that the PDZ1 domain of
syntenin, which also has a long β2-β3 loop, recognizes residues
upstream of the terminal hexapeptide of merlin, as exemplified by a
significantly higher affinity for the octapeptide than for the
hexapeptide (Kang et al., 2003). However, syntenin PDZ2 has a short
β2-β3 loop like PSD-95 PDZ1 and NHERF PDZ1, and these domains do not
interact with the bound peptides at this loop (Doyle et al., 1996;
Karthikeyan et al., 2001b, 2002). Thus, the interaction including P-3
appears to be optional, and it could enhance the binding in the
absence of, or in addition to, strong binding at the three C-terminal
residues.

Finally, the recent erbin structures show how a binding pocket can be
targeted by a residue that does not occupy the expected sequence
position. The crystal structure with ErbB2 peptide shows that Tyr-7 of
the peptide binds at a site within the β2-β3 loop (Birrane et al.,
2003). Interestingly, in the structure with the phage peptide
(AcTGWETWV), Trp-4 interacts at the same binding site (Skelton et al.,
2003). This implies that a structural epitope is more important than
the sequence for PDZ-peptide interaction. As shown by our structure of
syntenin’s PDZ2 and its interaction with its neighboring molecule,
residues far from the C terminus in sequence can be involved in
binding to the S-2 pocket by forming a contiguous structural epitope
with the C terminus. There is also the exceptional example of the
recognition of the nNOS β-finger, not the C terminus, by syntrophin
(Figure 3E; Hillier et al., 1999), but the mode we suggest implies
involvement of internal sequences in addition to the interaction of
the C terminus in PDZ interaction (Figure 3F).

Table 2. Proposed Binding Modes of PDZ Domains by the Combinatorial Model

PDZ Domain | C-Terminal Sequence Motifs of Representative Ligands | Binding Mode [a] | References

Class I (-S/T-X-Φ) [b]
PSD-95 PDZ3 | -T-S-V | A [c] | Doyle et al., 1996
NHERF PDZ1 | -T-R-L, -S-L-L, -S-F-L, -E-Q-L | A [c], B [c] | Karthikeyan et al., 2001a, 2001b, 2002
Erbin | -T-W-V, -V-P-V, -S-W-V | A [c], B [c] | Borg et al., 2000; Laura et al., 2002; Skelton et al., 2003
hPTP1E | -S-A-V | A [c] | Kozlov et al., 2002
Syntrophin | -S/T-X-V, -S-L-V, -T-T-F [d] | A [c], B [c] | Hillier et al., 1999; Schultz et al., 1998
MAGI3 PDZ2 | -S/T-W-V | B | Fuh et al., 2000
CIPP PDZ3 | -S-D/E-V, -Y-Y-V | A, C | Bezprozvanny and Maximov, 2001; Kurschner et al., 1998

Class II (-Φ-X-Φ)
hCASK | -R-E-F, -F-Y-A | A [c], C | Daniels et al., 1998
Syntenin | -F-Y-A, -S-V-F, -Y-Y-V, -Y-K-V | A, B, C [c], D [c] | Geijsen et al., 2001; Grootjans et al., 1997, 2000; Torres et al., 1998; this study
GRIP PDZ6 | -Y-S-C | B [c] | Im et al., 2003
p55 | -Y-F-I, -F-X-X, -Y-Y-F | B, C | Songyang et al., 1997
LIN-2 | -F-F-V/F/A | B, C | Songyang et al., 1997

Class III (-D/E-X-Φ)
nNOS | -D-S-V | A [d] | Tochio et al., 1999

Class IV (-X-Ψ-D/E, -X-W-C/S)
hINADL PDZ3 | -W-D-V, -Y-D-W, -S-W-E, -S-Y-E | B, D | Vaccaro et al., 2001
Mint1 PDZ1 | -D-W-C, -H-W-C, -Y-Y-V | C, D | Bezprozvanny and Maximov, 2001; Maximov et al., 1999
PAR6 | -H-W-C, -Y-Y-V | C, D | Bezprozvanny and Maximov, 2001
hINADL PDZ5 | -H-W-C, -Y-Y-V, -V-F-V | C, D | Bezprozvanny and Maximov, 2001; Vaccaro et al., 2001

[a] A, canonical P0 and P-2 binding; B, P0, P-1, and P-2 binding; C, P-1 and P-2 binding; D, P0 and P-1 binding, as shown in Figure 3.
[b] X denotes any amino acid; Φ denotes a hydrophobic residue; Ψ denotes an aromatic residue.
[c] Structure was solved by X-ray or NMR.
[d] β-finger binding.

One of the benefits of PDZ classification is the potential ability to
predict the binding partners for a given PDZ domain. However, even a
correctly predicted single C-terminal sequence motif may not be enough
to determine the binding capacity of any given PDZ domain. For
successful prediction of multiple binding partners for PDZ domains, it
would be better to characterize the PDZ domain by the specificity of
its binding pockets, especially the three pockets for the C-terminal
residues, and account for the likely combinations.

Biological  Implications 

Protein-protein interactions are pivotal to cell signaling events. The
PDZ domain is the most ubiquitous protein-protein interaction module
found in the human genome, with nearly 500 copies. Numerous
multidomain cytosolic proteins contain PDZ domains and bind to
receptors and channels, thereby serving as a membrane-associated
scaffold for the assembly of signaling complexes. Syntenin is a
ubiquitous protein containing two PDZ domains and is involved in
protein targeting and multiprotein assembly. Notably, it is
overexpressed in gastric and breast cancer cell lines, suggesting that
its function contributes to cytoskeleton regulation and cell
migration. Syntenin binds biologically important receptors such as
IL5R and syndecan. We report the crystal structures of complexes of
the second PDZ domain of syntenin, residues 113-270, with the
C-terminal peptides of IL5Rα and syndecan, solved at 1.35 Å and 1.85 Å
resolution, respectively. These structures reveal how one PDZ domain
interacts with different C-terminal sequences of binding targets.
Syndecan binds syntenin mainly through its C-terminal P-1 and P-2
residues, and IL5Rα interacts through its C-terminal P0 and P-1
residues. These results not only extend the knowledge of PDZ-ligand
recognition of specific targets but also explain the general scheme
underlying degenerate specificity. Furthermore, the mode of syntenin
PDZ2 interaction with its neighbor molecule in a crystal of the
unbound PDZ2 domain (1.24 Å resolution) suggests the importance of a
structural epitope for PDZ interactions rather than a sequence motif.
Based on our results and the results published elsewhere, we propose a
combinatorial model that generalizes the PDZ-ligand interactions. The
new model is likely to predict the possible binding of biologically
important target molecules more accurately than the current model.

Experimental  Procedures 

Protein  Expression  and  Purification 

A syntenin clone was obtained from the American Type Culture
Collection (ATCC 72537). The DNA fragment coding for PDZ2 (197-273)
was amplified by PCR and cloned into a GST-fusion expression vector
containing the TEV (tobacco etch virus) protease cleavage site
(Sheffield et al., 1999). This construct was used for expression of
protein samples for the crystallization of uncomplexed PDZ2. To obtain
a shorter version of PDZ2 (197-270), a stop codon (TGA) was introduced
after Met270 by the QuikChange® method (Stratagene). This step was
necessary to prevent intermolecular interactions between the PDZ2
domains, in which one molecule bound another via the terminal Phe (see
text). Both versions of the PDZ2 domain were expressed in the E. coli
BL21 strain (Stratagene) and purified using a glutathione-Sepharose 4B
column (Amersham Pharmacia Biotech). The eluted recombinant protein
was passed through a HiPrep 26/10 desalting column (Amersham Pharmacia
Biotech) equilibrated with 50 mM Tris-HCl (pH 7.5), 150 mM NaCl
(buffer A) to remove glutathione. After complete digestion with rTEV
protease (Life Technologies) at 10°C in the presence of 0.5 mM EDTA
and 1 mM DTT, the protein solution was passed again through a
glutathione-Sepharose 4B column, and the flow-through was concentrated
and loaded on a Superdex G75 column (Amersham Pharmacia Biotech)
equilibrated with buffer A. The protein fractions were collected and
concentrated. All the purification steps, except the rTEV digestion,
were performed at 4°C. The purified PDZ2 domain contains an additional
pentapeptide (GAMDP) at the N terminus due to the cloning procedure.

Crystallization 

The initial search for crystals of PDZ2 (197-273) was carried out with
Crystal Screen 1 (Hampton Research, Inc.) using the hanging drop
vapor-diffusion technique at 294 K. The best crystals of PDZ2 were
obtained with 8 mg/ml protein concentration at 0.1 M HEPES (pH 7.0),
34% PEG4000 using the sitting drop vapor-diffusion method with
microseeding. For the crystallization of complexes of short PDZ2
(197-270) with peptides, we used Additive Screen I (Hampton Research,
Inc.) for additional screening. The complex of PDZ2 and syndecan-4
peptide was crystallized from 0.1 M HEPES (pH 6.8), 1.6 M ammonium
sulfate, 20 mM CoCl2, and 0.2 M MgSO4, using 1:2 molar mixtures of
protein and peptide. The best crystal of PDZ2 with the IL5Rα peptide
was obtained from 0.1 M HEPES (pH 7.0), 1.6 M ammonium sulfate, 20 mM
CoCl2, and 0.2 M MgSO4 by microseeding. Synthesized octapeptide of
IL5Rα (ETLEDSVF) and hexapeptide of syndecan-4 (TNEFYA) were purchased
from Bio-Synthesis and UVA Biomolecular Research Facility,
respectively.

Data  Collection,  Structure  Determination,  and  Refinement 
Crystals were frozen in liquid nitrogen. Those containing peptides
were briefly soaked in the crystallization buffer containing 17.5%
(v/v) glycerol and peptide before freezing. An initial data set was
collected with an R-Axis IV detector and a Nonius FR591 generator
equipped with Osmic confocal mirrors. Subsequent data were collected
at beamline X9B at NSLS with a wavelength of 0.97946 Å under
cryoconditions using an ADSC Quantum4 CCD. Data sets were processed
and scaled using HKL2000 (Otwinowski and Minor, 1997).
Crystallographic details including unit cells and data statistics are
shown in Table 1. All structures were solved by the molecular
replacement method using AMORE (Navaza, 1994) and the atomic models
were refined with REFMAC5 (Murshudov et al., 1997) from the CCP4 suite
(CCP4, 1994). The atomic coordinates of the PDZ2 domain derived from
the structure of the PDZ tandem (entry 1N99 in the PDB) were used as
the initial model for molecular replacement of the home source data
set of PDZ2. The PDZ2 structure refined to 1.60 Å was subsequently
used as a model for molecular replacement of the other data sets
collected using synchrotron radiation. Manual model building was
performed in O (Jones et al., 1991). The final models agree well with
known protein geometry. Details of refinement are given in Table 1.
Compared to unbound PDZ2 in the PDZ tandem structure, the rms
differences for Cα atoms of PDZ2 structures with syndecan, IL5Rα
peptide, and alone but interacting with the neighboring molecule, are
0.39 Å, 0.44 Å, and 0.39 Å, respectively, indicating that the bound
peptide causes no significant structural changes in the PDZ domain.
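For readers unfamiliar with the rms difference quoted above, the quantity can be sketched as follows. This is a minimal illustration that assumes the two sets of Cα coordinates are already paired atom by atom and superposed; the superposition step itself, and whatever program was actually used for it, is not described in the text and is not reproduced here. The function name `rmsd` is introduced only for this example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between paired, already superposed atoms.

    Each argument is a sequence of (x, y, z) tuples in angstroms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Two toy atoms displaced by 0.3 and 0.4 angstroms, respectively.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
    print(round(rmsd(a, b), 3))
```

Values of about 0.4 Å, as reported here, are small compared with typical bond lengths, which is why they are read as indicating no significant conformational change.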
The crystal structure of the second PDZ domain of the scaffolding
protein syntenin was solved using data extending to 0.73 Å resolution.
The crystallographic model, including the hydrogen atoms and the
anisotropic displacement parameters, was refined to a conventional
R-factor of 7.5% and Rfree of 8.7%, making it the most precise
crystallographic model of a protein molecule to date. The model
reveals discrete disorder in several places in the molecule, and
significant plasticity of the peptide bond, with some ω angles
deviating by nearly 20° from planarity. Most hydrogen atoms are easily
identifiable in the electron density and weak hydrogen bonds of the
C-H···O type are clearly visible between the β-strands. The study sets
a new standard for high-resolution protein crystallography.
Introduction

For several decades after the pioneering diffraction experiments with
wet pepsin crystals in 1934 by J. D. Bernal and D. Crowfoot,1
crystallographers studying macromolecules maintained firmly that
protein crystals diffract weakly to a limited resolution, with only a
few exceptions of very small proteins. This problem was attributed to
such factors as intrinsic disorder in protein crystals, large volumes
filled with liquid solvent and the size of the unit cell. The advent
of synchrotron radiation2 changed this perception to some degree, but
the advantage of high flux was typically applied to improve the data
for poorly diffracting crystals, rather than to push the experimental
envelope for crystals that were of good quality to begin with.
Obtaining data to a resolution within 1.5-2.0 Å was critical, as it
allowed for the application of emerging refinement methods.3 It was
noted, however, that a number of protein crystals yielded diffraction
to an unexpectedly high resolution, even though other technical
difficulties precluded full utilization of that potential. For
example, the tetragonal crystals of the Bacillus cereus phospholipase
C gave very strong Bragg reflections beyond 1.0 Å resolution, observed
on still photographs recorded at beamline 9.6 in Daresbury (Z.S.D.,
unpublished results). Unfortunately, it was neither practical nor
feasible in the 1980s to collect atomic-resolution data on film, using
a rotation camera, and at 4°C, so a compromise limit of 1.5 Å was used
in that case instead.4 In subsequent years, the introduction of
digitally read imaging plates eventually permitted fine-slicing and
adequate spatial resolution of reflections at high diffraction angles,
while cryo-crystallography5 dispensed with the need to merge data from
a number of crystals, which routinely died from radiation damage after
a short period in the beam at temperatures above 0°C. Consequently, it
became possible to collect diffraction data to the true resolution
limit, and it quickly became apparent that for a number of protein
crystals this extends to atomic resolution, defined as 1.2 Å, or
significantly further.6-8
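The remark about reflections at high diffraction angles follows directly from Bragg's law, λ = 2d sin θ. The sketch below converts a resolution limit d into the scattering angle 2θ; it uses the 0.85 Å wavelength listed in the crystallographic data table of this study as an example input, and the function name is an assumption made for illustration.

```python
import math


def scattering_angle_deg(wavelength, d_spacing):
    """Return the scattering angle 2θ in degrees from Bragg's law.

    Both arguments are in angstroms; λ = 2 d sin θ.
    """
    s = wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("reflection not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(s))


if __name__ == "__main__":
    for d in (1.2, 0.8, 0.73):
        print(d, round(scattering_angle_deg(0.85, d), 1))
```

At this wavelength a 1.2 Å reflection appears near 2θ ≈ 41°, whereas a 0.73 Å reflection requires 2θ ≈ 71°, which illustrates why detector geometry becomes a limiting factor at ultra-high resolution.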

The field of ultra-high-resolution protein crystallography is of
paramount importance to structural biology, even though
atomic-resolution protein structures are these days upstaged, as
judged by the covers of select scientific periodicals, by
low-resolution membrane protein structures, because of the immediate
biomedical impact of the latter. One should not forget that detailed
structural information about molecules perceived as less appealing,
such as crambin,9 was vital for the development of refinement and
validation techniques that lend credibility to the low-resolution
models.10

In the course of our studies of the scaffolding protein syntenin, we
have identified a crystal form of the second PDZ domain of this
protein, referred to as synPDZ2, which diffracted beyond 0.8 Å
resolution at the X8C beamline at NSLS (Brookhaven National
Laboratory, Upton, New York). PDZ domains are structurally conserved
modules about 90 amino acid residues in size.11 Most function by
binding the C-terminal tail of the target protein in a structurally
conserved groove between the β2 strand and the α2 helix. The terminal
carboxylate group of the target is anchored via hydrogen bonds from
three main chain amide groups within a conserved glycine-rich loop
preceding β2, a fingerprint of the PDZ fold.

For technical reasons we collected data to 0.73 Å, with completeness
falling gradually beyond 0.8 Å, due to the square shape of the
detector (see Materials and Methods). In spite of this deficiency, the
data still compare very favorably with other reported
ultra-high-resolution protein studies (see Table 1), consequently
allowing for an unbiased evaluation of various aspects of
crystallographic refinement and protein chemistry. The final atomic
model was refined to an R-factor of 7.5% (Rfree 8.7%), making it the
most precisely refined crystallographic model of a protein molecule to
date. The model conforms very well to the expected stereochemical
parameters, and highlights two important aspects: the flexibility of
the peptide bond, and the distinct stereochemistry of weak C-H···O
hydrogen bonds.
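The flexibility of the peptide bond is usually expressed through the ω torsion angle, defined by the atoms Cα(i), C(i), N(i+1), and Cα(i+1). As a hedged illustration of how such a value and its deviation from planarity might be computed from atomic coordinates, the sketch below implements a generic dihedral-angle calculation; the helper names are hypothetical and the coordinates in the example are artificial, not taken from the synPDZ2 model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def deviation_from_planarity(omega_deg):
    """Deviation of ω from the nearest planar value (0 or ±180 degrees)."""
    a = abs(omega_deg) % 180.0
    return min(a, 180.0 - a)


if __name__ == "__main__":
    # Artificial, perfectly planar trans arrangement.
    omega = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
    print(abs(omega), deviation_from_planarity(omega))
```

An ω of 160°, for example, would give a deviation of 20°, the magnitude reported for the most distorted peptide bonds in this structure.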


Results 

Improvement  in  crystal  quality 

We have reported the structure of synPDZ2 at 1.24 Å resolution.12 In
the present study, we were able to extend the data to the
ultra-high-resolution range owing to further improvements in the
quality of the crystals (Table 2). The high-quality diffraction of
these crystals is probably due, in part, to compact packing, in
agreement with recent statistical analyses.13 The packing is mediated
by extensive interactions between neighboring molecules along the
b-axis. The Matthews coefficient is 1.79 Å3/Da, and the solvent
content is 31.2% (v/v). The unit cell volume of the new crystals is
smaller by 1.7% compared to our previous study. The observed
"compression" is primarily along the b-axis, and is associated with
some subtle repacking of residues in the crystal contact region. In
addition, in the ultra-high-resolution structure, we did not observe
oxidation of Cys239, located near the C-terminal binding motif, as
seen in the 1.24 Å resolution structure. It is not clear if this has
had an effect on crystal quality. Although we used the same
crystallization conditions for both experiments, we cannot rule out
the possibility that the observed compressed packing is due to small
differences in the concentration of polyethylene glycol (PEG) used.
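The packing figures quoted above are linked by simple arithmetic, sketched here under stated assumptions. The cell volume of a monoclinic cell is a·b·c·sin β, the Matthews coefficient is the cell volume divided by the total protein mass in the cell, and the solvent fraction is commonly approximated as 1 − 1.23/Vm. The cell parameters are those listed in the crystallographic data table; the molecular weight is not stated in this section, so it is left as a parameter, and the function names are introduced only for this example.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume in cubic angstroms for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight):
    """Vm in cubic angstroms per dalton."""
    return cell_volume / (molecules_per_cell * molecular_weight)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    volume = monoclinic_cell_volume(25.88, 39.54, 32.28, 109.6)
    print(round(volume))                       # cell volume in cubic angstroms
    print(round(solvent_fraction(1.79), 3))    # fraction for Vm = 1.79
```

Applied to Vm = 1.79 Å3/Da, the approximation gives a solvent fraction of about 0.31, consistent with the 31.2% quoted in the text.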


Table 1. The highest-resolution protein structures in the PDB and comparison to synPDZ2

Protein | PDB entry | Residues | Vm (Å3/Da) | Maximum resolution (Å) | Rmerge (shell) [a,b] | Completeness (outer shell) [b] | R [c] | Rfree [c]
Crambin | 1EJG | 46 | 1.40 | 22.4-0.54 | 5.5 (14.8) | 97.6 (100) | 9.0 | 9.4
Antifreeze protein RD1 | 1UCS | 64 | 1.40 | 22.3-0.62 | 7.2 (65.3) | 92.8 (91.8) | 13.3 | 15.5
Subtilisin (serine protease) | 1JEA | 269 | 2.31 | 35.0-0.78 | 3.6 (29.0) | 97.3 (92.7) | 9.9 | 10.3
High potential iron-sulfur protein | 1IUA | 83 | NA [d] | 20.0-0.80 | 5.2 (39.7) | 98.6 (95.5) | 10.1 | 11.4
Trypsin [e] | 1FY4 | 227 | NA | 20.0-0.81 | 3.8 (26.9) | 92.8 (88.4) | 10.8 | NA
Photoactive yellow protein | 1NWZ | 125 | 1.38 | 30.0-0.82 | NA | 97.5 (85.7) | 12.3 | 14.4
Triosephosphate isomerase | 1N55 | 251 | 1.77 | 25.0-0.83 | 2.9 (39.4) | 99.3 (97.2) | NA | 10.8
β-Lactamase TEM-1 | 1M40 | 263 | 1.69 | 15.0-0.85 | 5.2 (42.0) | 100 (100) | 9.1 | 11.2
Acutohaemolysin | 1MC2 | 122 | 1.45 | 10.0-0.85 | 3.0 (30.2) [f] | 79.9 (30.0) | 9.5 | 12.1
Trypsin inhibitor | 1G6X | 58 | 2.28 | 10.0-0.86 | 3.6 (48.8) [f] | 94.9 (8.6) | 10.7 | 14.0
Xylose isomerase | 1MUW | 386 | 1.97 | 50.0-0.86 | 5.1 (56.5) | 97.4 (92.0) | 12.5 | 14.3
Syntenin PDZ2 | 1R6J | 82 | 1.79 | 20.0-0.73 | 6.0 (38.7) [f] | 83.9 (9.6) | 7.5 | 8.7

The Table includes protein structures with higher than 0.86 Å resolution except peptide structures: α-conotoxin SI (14 residues), designed peptide α-1 (26 residues) and gramicidin D (36 residues).

[a] $R_{merge} = \sum |I_i - \langle I \rangle| / \sum I_i$, where $I_i$ is the intensity of the ith observation and $\langle I \rangle$ is the mean intensity of the reflections.
[b] The numbers in parentheses describe the relevant value for the highest-resolution shell.
[c] $R = \sum ||F_{obs}| - |F_{calc}|| / \sum |F_{obs}|$, crystallographic R factor, and $R_{free} = \sum ||F_{obs}| - |F_{calc}|| / \sum |F_{obs}|$ when all reflections belong to a test set of randomly selected data.
[d] Not available.
[e] The structure of lowest R factor among four 0.81 Å resolution trypsin structures.
[f] Due to incompleteness of high-resolution data, the nominal resolution of these studies should be somewhat lower than the maximum resolution.
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Table  2 ,  Crystallographic  data 

A. Experimental data

Wavelength (Å)               0.8500
Space group                  P2₁
Unit cell parameters
  a (Å)                      25.88
  b (Å)                      39.54
  c (Å)                      32.28
  β (deg.)                   109.6
Resolution (Å)               20.0-0.73 (0.76-0.73)^a
Total reflections            316,159
Unique reflections           71,347 (813)
Completeness (%)             83.9 (9.6)
Rmerge^b (%)                 6.0 (38.7)
Average I/σ(I)               25.3 (1.7)
Wilson B-factor (Å²)         3.75

B. Refinement details

                                        Refmac5      Shelx-97
Resolution (Å)                          16.6-1.0     10.0-0.73
Reflections (working)                   33,177       69,900
Reflections (test)                      655          1,389
Rwork^c
  All data (%)                          7.89         7.45
  Fo > 4σ (%)                           -            6.62
Rfree^c
  All data (%)                          9.63         8.66
  Fo > 4σ (%)                           -            7.79
Number of water molecules               209          237
R.m.s. deviation from ideal geometry
  Bonds (Å)                             0.012        0.019
  Angles (deg.)                         1.724        -
  Angle bonded distances (Å)            -            0.033
Average B-factor
  Main chain (Å²)                       3.32         4.70
  Side-chain (Å²)                       4.53         6.00
  Water (Å²)                            10.73        10.82

a The numbers in parentheses describe the relevant value for the last resolution shell.
b $R_{\mathrm{merge}} = \sum |I_i - \langle I \rangle| / \sum I$, where $I_i$ is the intensity of the ith observation and $\langle I \rangle$ is the mean intensity of the reflections.
c $R_{\mathrm{work}} = \sum ||F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|| / \sum |F_{\mathrm{obs}}|$, crystallographic R factor,
and $R_{\mathrm{free}} = \sum ||F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|| / \sum |F_{\mathrm{obs}}|$ when all reflections belong
to a test set of randomly selected data.


Precision  of  the  model  refinement 

Due to the square shape of the detector and the data collection geometry, we
were able to record a nearly complete dataset to 0.8 Å resolution, and a
progressively decreasing fraction of data to 0.73 Å (Table 2). The
alternative would have been to use 2θ geometry, which was technically not
possible at the time. Although only 56% complete, the data beyond 0.8 Å
contain 7413 useful reflections (10% of the recorded data), and we decided to
use all those data in subsequent calculations. An atomic model of synPDZ2
with isotropic displacement parameters, but without hydrogen atoms, was
refined to an Rwork of 14.8% and an Rfree of 15.7%. Anisotropic refinement
decreased Rwork to 10.8% and Rfree to 12.5%. This is in agreement with
previously reported observations that anisotropic displacement parameters
typically account for 4-5% of the R-factor. Adding hydrogen atoms at this
stage reduced Rwork to 9.61% and Rfree to 10.98%.


At this point in the refinement, we discovered that the N terminus and the
β1-β2 loop have minor secondary conformations. Adjusting the model to
include this disorder had a significant impact on the agreement factors
(Rwork = 8.30%, Rfree = 9.96%). Further refinement of the solvent structure,
and adjustment of the occupancies of the disordered parts and water
molecules, resulted in a drop of Rwork to 7.53% and Rfree to 8.71%. The only
remaining option was to introduce limited freedom for hydrogen atoms, and
this proved marginally helpful, bringing Rwork to 7.47% and Rfree to 8.66%.
At this point, we discontinued using Rfree and used all the data for the
final phase of calculations. Unrestrained refinement had no noticeable effect
on R (7.45%), but restraints for the minor alternative conformations had to
be kept in place, because without them the coordinates of atoms with low
occupancy shifted out of the density. A full-matrix refinement resulted in no
change to the atomic coordinates, the R-factor or the electron density map.
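
The agreement factors tracked through the refinement follow the definition given in the footnotes to Table 2. A minimal sketch of that calculation is shown below. It assumes the calculated amplitudes are already on the same scale as the observed ones (real refinement programs also fit a scale factor), and the helper names and the toy amplitudes are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_work_test(n_reflections, test_fraction=0.02, seed=0):
    """Randomly flag a fraction of reflection indices as the test (free) set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = max(1, int(round(test_fraction * n_reflections)))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Toy amplitudes for illustration only (not data from this study).
f_obs = [10.0, 20.0, 30.0, 40.0, 50.0]
f_calc = [9.0, 22.0, 30.0, 41.0, 48.0]
work, test = split_work_test(len(f_obs), test_fraction=0.2)
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in test], [f_calc[i] for i in test])
print(f"Rwork = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Rwork is computed on the reflections used in refinement and Rfree on the held-out test set, which is why Rfree is the more conservative indicator of the two.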

The R-values for the synPDZ2 model are the lowest for any protein structure
recorded to date (Table 1). With the exception of crambin at 0.54 Å
resolution, no prior study succeeded in bringing Rfree below 10%, whereas our
model gives 8.7%. The Rwork value is several percentage points lower than
those of representative structures reported in the 0.78-0.85 Å resolution
range.


Multiple  main  chain  and  side-chain 
conformations 

The N terminus of synPDZ2, Gly192 to Thr198 using the numbering of
full-length syntenin, shows distinct static disorder (Figure 1). The
occupancy of the minor conformer is 0.18, and the electron density for the
respective atoms is only a little higher than that of the hydrogen atoms of
main-chain amide groups. The N-terminal amino group of the major conformer
donates a hydrogen bond to the carbonyl oxygen atom of Arg229 of the
neighboring molecule, while the carbonyl group of Gly192 in the minor
conformer accepts a hydrogen bond from the amide group of Ser252 of another
partner molecule across crystal contacts. Another apparent difference is the
presence of a chloride ion near the N terminus of the major conformer. It is
possible that the conformational heterogeneity is related to the partial
occupancy of the chloride ion.

Figure 1. Electron density maps around the disordered N terminus of synPDZ2.
The 2Fo − Fc electron density map (blue) and Fo − Fc difference electron
density map (red) from the structure with the major conformer alone are
contoured at +4.0σ. Major and minor conformers of residues Met194 to Arg197
are shown as thick and thin frames, respectively. The Figures were generated
using O.20
Another fragment of the structure with two alternative main-chain
conformations is the β1-β2 loop, i.e. Asp204 to Gly207. Although the ratio of
occupancies of the major to minor conformers (0.81:0.19) is similar to that
observed at the N terminus, it is unlikely that the two are interdependent,
since there is no direct contact between them. This loop does not interact
with any of the neighboring molecules, and the disorder is more likely to be
associated with the heterogeneity of the hydrogen bonds within the loop.
Hydrogen bonds between Oγ1 of Thr206 and Oδ1 of Asp204 (2.67 Å), and between
Nδ1 of His208 and Oγ1 of Thr206 (2.70 Å), are found in the major conformation
but not in the minor conformation. Interestingly, the position of the loop in
the previously reported PDZ2 structure is between the two high-resolution
conformations, and the electron density in this region is not clear in other
structures of syntenin either. It is possible that the ultra-high-resolution
data were able to resolve static disorder that was unidentifiable earlier.

There are alternative conformations for the side-chains of Asp195, Ser205
(both within the major main-chain conformer), His202, Lys214, Ile218, Glu235,
His236, Glu240, Asn245, Ser259, Thr268, and Met270. A higher number of
multiple occupancies is expected with improvement in resolution.14

Overall, when the model is compared to the 1.24 Å resolution synPDZ2
structure, it shows an r.m.s. difference for the main-chain atoms of 0.745 Å
for all residues, but the value is only 0.364 Å when the N-terminal five
residues and C-terminal two residues are excluded. These residues participate
in the stacking along the b-axis, where the unit cell compression resulted in
packing rearrangements.


Table 3. Analysis of main chain bond lengths and angles

              Residues          Min value   Max value   Mean value        Mean value
                                                                          (small molecule data)^a
A. Bond (Å)
C-N           Except Pro        1.304       1.358       1.333 (0.009)^b   1.329 (0.014)
              Pro               1.325       1.341       1.333 (0.008)     1.341 (0.016)
C-O                             1.215       1.267       1.235 (0.010)     1.231 (0.020)
Cα-C          Except Gly        1.492       1.544       1.525 (0.010)     1.525 (0.021)
              Gly               1.508       1.543       1.520 (0.011)     1.516 (0.018)
Cα-Cβ         Ala               1.510       1.526       1.519 (0.007)     1.521 (0.033)
              Ile, Thr, Val     1.520       1.587       1.540 (0.013)     1.540 (0.027)
              The rest          1.506       1.563       1.532 (0.013)     1.530 (0.020)
N-Cα          Except Gly, Pro   1.434       1.466       1.435 (0.008)     1.458 (0.019)
              Gly               1.432       1.456       1.442 (0.008)     1.451 (0.016)
              Pro               1.462       1.467       1.465 (0.002)     1.466 (0.015)

B. Angle (deg.)
Cα-C-N        Except Gly, Pro   112.73      120.45      116.34 (1.53)     116.2 (2.0)
              Gly               113.73      119.79      117.63 (1.91)     116.4 (2.1)
              Pro               115.07      116.52      115.79 (0.73)     116.9 (1.5)
O-C-N         Except Pro        120.38      125.42      122.87 (1.03)     123.0 (1.6)
              Pro               122.28      123.55      122.91 (0.64)     122.0 (1.4)
C-N-Cα        Except Gly, Pro   118.24      126.41      121.72 (1.60)     121.7 (1.8)
              Gly               120.09      121.94      121.06 (0.62)     120.6 (1.7)
              Pro               119.49      119.51      119.50 (0.01)     122.6 (5.0)
Cα-C-O        Except Gly        116.84      123.42      120.72 (1.30)     120.8 (1.7)
              Gly               116.89      123.20      119.32 (1.74)     120.8 (2.1)
Cβ-Cα-C       Ala               108.83      111.15      109.97 (0.94)     110.5 (1.5)
              Ile, Thr, Val     103.06      113.80      110.89 (2.04)     109.1 (2.2)
              The rest          106.43      113.84      110.24 (1.43)     110.1 (1.9)
N-Cα-C        Except Gly, Pro   106.80      114.11      110.44 (1.89)     111.2 (2.8)
              Gly               109.62      118.19      114.60 (2.29)     112.5 (2.9)
              Pro               110.05      110.17      110.11 (0.06)     111.8 (2.5)
N-Cα-Cβ       Ala               109.74      110.75      110.21 (0.39)     110.4 (1.5)
              Ile, Thr, Val     109.01      113.50      111.14 (1.26)     111.5 (1.7)
              Pro               102.87      103.29      103.08 (0.21)     103.0 (1.1)
              The rest          107.31      114.44      110.74 (1.48)     110.5 (1.7)

a The small molecule data used in the above analysis are from Ref. 12.
b Standard deviation in parentheses.

Secondary structure and the planarity of the
peptide bond

According to an analysis by PROCHECK,15 93% of residues other than glycine
and proline are in the most favored regions of the Ramachandran plot, while
7% of residues are in additionally allowed regions. In addition, seven
glycine and two proline residues are in the favorable regions. Main-chain
bond lengths and bond angles within the major conformer are within normal
standard deviations (0.011 Å for bond lengths and 1.625° for bond angles).
The r.m.s. deviations of main-chain bond lengths and angles in the
alternative N terminus and the β1-β2 loop are 0.016 Å and 2.844°,
respectively. Although we kept the restraints for the minor conformer in
place throughout the refinement, these values are higher than those
associated with the major conformer, reflecting greater uncertainty due to
lower occupancy. Mean values for main-chain bond lengths and bond angles in
the major conformer, which were refined without any restraints, are
comparable with those observed for small molecules16 (Table 3).

The mean ω angle in the refined model is 178.4°, lower than the ideal value
of 180° and the mean value of 179.0° from other protein crystallographic
studies,17,18 but in agreement with several atomic-resolution structure
determinations, such as that of a ribonuclease from Streptomyces
aureofaciens19 (Figure 2). Eight peptides deviate from the mean more than the
others. Ser261, located at the end of the α2 helix, has the lowest ω value
(162.2°), corroborated unequivocally by the electron density. Asn230
(ω = 169.4°) is also located at the end of a helix (α1). Two adjacent leucine
residues, Leu232 and Leu233, located in the short loop between the α1 helix
and the β4 strand, have ω angles of 189.9° and 165.7°, respectively. Two high
ω angles were observed for glycine residues in the turn regions, Gly207
(191.3°) and Gly216 (192.5°). Two further residues in the minor conformation,
Thr198 at the end of the N-terminal segment and Asp204 at the beginning of
the β1-β2 loop, have high ω values (194.8° and 191.2°). However, because of
the low occupancy of the minor conformer, these values are less reliable.
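
For readers who want to reproduce this kind of analysis, the ω angle is the torsion defined by the atoms Cα(i), C(i), N(i+1) and Cα(i+1). The sketch below computes a torsion from four Cartesian positions and reports it on a 0-360° scale, which matches the way values such as 189.9° are quoted above. The function names and the test coordinates are hypothetical, and this is not the software used in the study.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]


def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def torsion_deg(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, mapped onto the range [0, 360)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0


def deviation_from_planarity(omega_deg):
    """Absolute departure of a trans peptide from the ideal 180 degrees."""
    return abs(omega_deg - 180.0)


# Idealized planar trans arrangement (toy coordinates, not from the model).
omega = torsion_deg((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(f"omega = {omega:.1f} deg, deviation = {deviation_from_planarity(omega):.1f} deg")
print(f"Ser261 deviation: {deviation_from_planarity(162.2):.1f} deg")
```

On this scale a value below 180° and a value above 180° correspond to twists of opposite sense about the peptide bond.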

Side-chain  conformers 

MacArthur and Thornton14 found a systematic variation with resolution of the
mean values of the χ1 rotamers. We compared the values in our
ultra-high-resolution structure to the extrapolation of their regression line
to ultra-high resolution. For the major conformation, the mean values of the
gauche−, trans and gauche+ χ1 rotamers are 67.3° (6.0°), 187.5° (8.6°) and
−65.0° (7.3°), respectively. Both the gauche− and gauche+ values are close to
the values expected at high resolution (65.6° and −65.4°), while that of the
trans rotamer deviates from the expected value of 181.6°. We analyzed the χ1
rotamers of serine and leucine residues, which show highly significant
correlations of mean value with resolution. There are seven serine residues
in the structure. Three residues have the trans rotamer and two have the
gauche− rotamer, while the other two have dual or triple occupancies. This is
in good agreement with the observation that there is a higher proportion of
the gauche− rotamer than of the gauche+ rotamer in high-resolution
structures. However, the values of gauche− or gauche+ do not follow the trend
of resolution dependency, although admittedly the sample is too small to draw
general conclusions. The mean value of the three gauche− rotamers is 62.6°,
lower than the mean value for high-resolution structures (66.3°). One rotamer
of Ser259 has a higher gauche+ value (−54.2°), while the mean value in
high-resolution structures is −66.8°. Ser205 in the major conformation has
all three χ1 rotamers as alternative side-chain positions. The values of the
gauche−, trans and gauche+ rotamers are 60.5°, 176.6° and −57.3°,
respectively. Three of the four leucine residues in the structure have
gauche+ rotamers with a mean value of 61.5°. This is in good agreement with
the higher mean gauche+ value found in high-resolution structures.

Figure 2. The peptide bond distortions. A, Ser261 with the lowest value
(162.2°) of the ω angle in the structure: the 2Fo − Fc electron density map
is contoured at +1.0σ (gray) and +4.0σ (red). B, Histogram of ω angles in the
refined model.

Figure 3. Distorted planarity of Phe273. The bending between Cβ and the
benzene ring of Phe273 is indicated by an arrow.

There is one case of apparently distorted side-chain geometry, i.e. Phe273.
The r.m.s. deviation of its side-chain atoms from planarity is 0.056 Å, while
the side-chains of other residues (Arg, Gln, Phe, Asn, Asp, Glu and His) show
normal planarity. The phenyl ring of Phe273, the C-terminal residue in this
construct, binds in the peptide-binding pocket of the adjacent molecule,
mimicking a canonical PDZ interaction.12 The binding pocket includes the
glycine-rich carboxylate-binding loop, while the side-chain is embedded
within a hydrophobic cluster of Phe211, Phe213, Ala255 and Leu258. The
side-chain of Phe273 bends to fit tightly into the pocket, so that Cβ is
significantly out of the plane of the phenyl ring (Figure 3). Several atoms
of the Phe273 phenyl ring make close contacts (d < 4.0 Å) with Phe213 and
Leu258, but the closest contact is between Cζ of Phe273 and Cβ of Ala255
(3.49 Å). The latter contact is probably the key stereochemical reason for
the bending of the benzene ring. A possible χ1 rotation to relieve this
strain is blocked by the closely located Ser259 and Met194.

There are eight free carboxyl groups in the structure (aspartate, glutamate
and the C terminus). As judged by the bond lengths, all are ionized, with a
mean C-O bond length of 1.254 Å, ranging from 1.227 Å to 1.270 Å.

Hydrogen  atoms 

Most  hydrogen  atoms  of  the  model  are  visible  in 
the  electron  density  map  and  the  inclusion  of 
hydrogen  atoms  in  the  model  reduces  the  R-factor 
by  1.6%.  We  introduced  riding  hydrogen  atoms 
only  into  the  major  conformer.  The  positions  of 


hydrogen  atoms  on  the  rigid  group  (Y-X-H)  are 
refined  with  free  rotation  about  the  Y-X  bond. 
However,  most  of  the  hydrogen  atoms  in  methyl 
groups  lock  into  unique  positions  and  there  is 
only  a  limited  level  of  rotation.  Hydrogen  atoms 
in  CH,  NH  and  aliphatic  CH2  groups  show  good 
electron  density  (Figure  4).  However,  electron 
density  of  amide  hydrogen  in  some  cases  is  not 
clear.  Such  amide  groups  are  typically  involved  in 
the  hydrogen  bond  with  an  ionized  group.  The 
amide  groups  of  Gly210  and  Phe211  have 
hydrogen  bonds  with  the  C-terminal  carboxyl 
group  of  Phe273,  and  those  of  Thr206  and  His208 
interact  with  the  carboxyl  group  of  Asp204.  The 
amide  group  of  Ala193  has  a  hydrogen  bond  with
the  chloride  ion  in  the  major  conformation  of  the 
N  terminus.  The  low  electron  density  of  these 
protons  may  be  due  to  the  strong  character  of 
hydrogen  bonds  and  delocalized  hydrogen  atoms. 

Anisotropic  displacement  parameters 

Anisotropic displacement parameters (B-factors) were used for all
non-hydrogen atoms, including the water oxygen atoms. The temperature factors
of hydrogen atoms were assigned a value 1.2 times greater (1.5 times for
methyl groups) than that of the heavy atom to which they are bound.
Introduction of the anisotropic model resulted in a significant drop (about
4%) in both Rwork and Rfree. The mean anisotropy factor for all protein atoms
in the major conformer is 0.58 (0.60 for main-chain and 0.55 for side-chain
atoms). The corresponding average isotropic B-value for protein atoms is
6.7 Å² (4.7 Å² for main-chain and 7.9 Å² for side-chain atoms). The mean
anisotropy factors for the main chain in flexible parts are lower than
average (0.52 for the N terminus and 0.43 for the β1-β2 loop). Many
side-chain atoms in minor conformations have large anisotropy (axis ratio
greater than 1:5), and some are extreme (greater than 1:10), probably
reflecting genuine dynamic disorder. Our ultra-high-resolution structure has
low anisotropy compared to the average anisotropy factor of 0.45 for other
anisotropically refined structures.20

Classic  hydrogen  bonds 

There are 187 possible hydrogen bonds, excluding those between water
molecules in the structure. These bonds include the interactions where the
donor-acceptor distance is less than the radius of the acceptor plus 2 Å, and
where the D-H-A angle (where D is the donor and A is the acceptor) is larger
than 110°. The mean bond length and angle in this group are 2.94 Å and
154.4°, respectively. With nitrogen as donor, the values are 2.95 Å and
155.6°, while with oxygen as donor they are 2.85 Å and 141.8°. Oxygen is the
donor in seven of the 14 short hydrogen bonds (below 2.7 Å), compared to 30
of the 187 hydrogen bonds overall. Short hydrogen bonds are rare in synPDZ2,
and only three are shorter than 2.6 Å. None exhibits the characteristics of
the so-called low-barrier H-bonds.10

Figure 4. Examples of electron density maps revealing the position of
hydrogen atoms. The 2Fo − Fc electron density map (gray) is contoured at
+1.0σ, while the Fo − Fc difference electron density map (red) prior to
inclusion of hydrogen atoms is contoured at +3.0σ and superimposed. A,
Ile218; B, Lys203; C, Leu258; and D, Ile247.
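
The geometric criteria stated above (donor-acceptor distance below the acceptor radius plus 2 Å, and a D-H-A angle above 110°) can be written as a simple test. The sketch below is illustrative only: the acceptor radii in the dictionary are assumed van der Waals values, since the text does not list the radii actually used, and the coordinates are toy values.

```python
import math

# Assumed acceptor radii in angstroms (not taken from the paper).
ASSUMED_ACCEPTOR_RADIUS = {"O": 1.52, "N": 1.55}


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(b, vertex)]
    cos_angle = sum(x * y for x, y in zip(v1, v2)) / (
        math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def is_hydrogen_bond(donor, hydrogen, acceptor, acceptor_element="O",
                     extra_distance=2.0, min_angle=110.0):
    """Apply the distance and D-H-A angle criteria quoted in the text."""
    max_distance = ASSUMED_ACCEPTOR_RADIUS[acceptor_element] + extra_distance
    close_enough = distance(donor, acceptor) < max_distance
    well_oriented = angle_deg(donor, hydrogen, acceptor) > min_angle
    return close_enough and well_oriented


# Toy linear N-H...O arrangement with a 2.9 angstrom donor-acceptor distance.
print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
```

Because the angle criterion uses the hydrogen position, a test of this kind is most meaningful when hydrogen atoms are either observed or placed reliably, as is the case at this resolution.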

Weak (CH···O) hydrogen bonds

We found several good examples of CH···O hydrogen bonds stabilizing the PDZ
fold. These weak cohesive interactions are seen specifically between
β-strands, in accord with previous studies.21,22 One of them is between
Asn237 in the β4 strand and Ile269 in the β6 strand: the carbonyl oxygen atom
of Asn237 accepts hydrogen bonds from Cα of Ile269 and from the amide group
of Met270 (Figure 5). The hydrogen-oxygen distance in the CH···O bond is
shorter (2.31 Å) than that in the NH···O bond (2.36 Å), while the distances
from the carbonyl oxygen atom to Cα and to the amide nitrogen atom are
virtually identical (3.16 Å and 3.17 Å, respectively). Given the high
precision of atomic coordinates in our model, this is a dramatic illustration
of the stereochemistry of CH···O bonds in β-sheets.

Figure 5. The CH···O hydrogen bond between antiparallel β-strands. The
2Fo − Fc electron density map (gray) is contoured at 1σ, and the Fo − Fc
difference electron density map (red) of the hydrogen-free model is contoured
at +4.0σ and superimposed.

Another class of weak hydrogen bonds in proteins involves π-acceptors. It
includes NH-π or OH-π and weaker CH-π interactions. The π-acceptors are
typically aromatic rings. However, XH-π hydrogen bonds can be formed with
many acceptors other than phenyl rings, such as various heterocycles, C=C,
C≡C, and other π-bonded moieties.24 In synPDZ2, we could not find weak
hydrogen bonds other than the CH···O bonds, in spite of an extensive search
using the NCI server.25

Solvent  structure 

We identified 237 water molecules, including partially occupied molecules. A
total of 79 water molecules are less than half-occupied, so the total sum of
occupancy for water molecules is 150.2. With two protein molecules per unit
cell, this is equivalent to 300 water molecules per unit cell. Given the
solvent content of 31.2% (v/v), about 323 water molecules are possible in a
unit cell, so we identified more than 90% of the possible water molecules in
the crystal. We also found an additional chloride ion. It is coordinated by
the amide groups of Ala193 (3.26 Å) and Ser252 (3.43 Å) of the neighboring
molecule (x, y + 1/2, z).
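
The estimate of how many water molecules the crystal can hold can be reproduced approximately from the unit cell and the solvent content. The sketch below assumes a volume of about 30 Å³ per water molecule, which is a common rule of thumb and is not stated in the text, together with the cell parameters from Table 2 and two asymmetric units per P2₁ cell. The function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (in cubic angstroms) of a monoclinic cell with unique angle beta."""
    return a * b * c * math.sin(math.radians(beta_deg))


def possible_waters_per_cell(cell_volume, solvent_fraction, volume_per_water=30.0):
    """Solvent volume divided by an assumed volume per water molecule."""
    return cell_volume * solvent_fraction / volume_per_water


def modelled_waters_per_cell(occupancy_sum, asymmetric_units_per_cell):
    """Occupancy-weighted water count scaled up to the whole unit cell."""
    return occupancy_sum * asymmetric_units_per_cell


volume = monoclinic_cell_volume(25.88, 39.54, 32.28, 109.6)
possible = possible_waters_per_cell(volume, 0.312)
modelled = modelled_waters_per_cell(150.2, 2)
print(f"possible waters per cell : {possible:.0f}")
print(f"modelled waters per cell : {modelled:.0f}")
print(f"fraction identified      : {modelled / possible:.2f}")
```

Under these assumptions the numbers agree with the figures quoted in the text, which suggests that a similar volume per water molecule was used.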

Most water molecules are well ordered, with hydrogen bonds to the protein and
to other water molecules. The average B-factor of water molecules is 10.8 Å²
and, as expected, the distribution of B-values depends on the hydrogen
bonding pattern. The water molecules with four hydrogen bonds have an average
B of 5.6 Å², while those with three, two and one hydrogen bonds show B-values
of 6.7 Å², 7.1 Å² and 8.1 Å², respectively. The average B-value of water
molecules beyond the first water shell is 12.8 Å². Most water molecules with
high occupancy have good density for the hydrogen atoms involved in hydrogen
bonds (Figure 6). There are several examples of pentagonal rings of water
molecules involving an oxygen atom from the protein.

Figure 6. Electron density for well-ordered solvent. The 2Fo − Fc electron
density map (gray) is contoured at 1σ, and the Fo − Fc difference electron
density map (red) is contoured at +3.5σ and superimposed.

To compare the positions of water molecules to those from the previous,
lower-resolution study (PDB entry 1NTE), we superimposed the Cα atoms of
residues Thr198 to Met270 of the two models. Among the water molecules within
4 Å of any of the 168 water molecules in the previous structure, 97 are
located within 1.2 Å of a water molecule in the low-resolution model, and 51
are within 0.73 Å. Given the precision of the atomic coordinates in both
models, this indicates a substantial variation in the solvent structure.


Discussion 

The ultra-high-resolution structure of synPDZ2 narrows the classical
discrepancy between small molecule and macromolecular crystallography. For
decades, the best protein crystal structures were refined to an R-factor in
the range 0.15-0.20, while small molecules are routinely refined to
0.02-0.03. This raised the question of what precisely is the cause of the
"R-gap". The synPDZ2 structure, refined to an R-factor of 7.45%, comes near
to closing the gap, and it is instructive to reflect on why this was
possible.

The introduction of an anisotropic vibrational model and inclusion of
hydrogen atoms seem to account for no more than the usually observed
difference of 0.05-0.07 in the R-factor. We note, however, that the inclusion
of the minor main-chain conformers in two separate fragments made a critical
difference, even though some occupancies are relatively low. When only the
main conformer is included, the R-factor is 9.20%. In comparison, elimination
of hydrogen atoms from the model increases the R-factor to 9.07%, and
substitution of isotropic displacement parameters yields an R-factor of
12.2%.

The idea of using a limited ensemble of structures for crystallographic
refinement is not new. It was originally proposed by Kuriyan and colleagues,
who observed a lower R-factor when two structures were used instead of
one.26,27 The result was initially open to the criticism that the
introduction of more parameters results in overfitting but, as
cross-validation had not yet been introduced into crystallographic
procedures, the method was never evaluated properly. It might be instructive
to revisit this approach, particularly in the case of ultra-high-resolution
structures.

The highly refined model of synPDZ2 shows canonical stereochemistry and
conforms to the existing libraries of geometric restraints. Two observations
are noteworthy: the deviations from planarity of the peptide unit, and the
stereochemical evidence for C-H···O hydrogen bonds.

The concept of a planar peptide bond played a key role in the history of
protein science. It is generally attributed to Linus Pauling, who may have
had a good understanding of the resonance effects very early on, and who
appreciated the importance of the structure of diketopiperazine, which showed
a flat peptide unit.28 He explicitly put the general notion in writing in
1939, in a famous paper co-authored with C. Niemann, demolishing the cyclol
theory of protein structure formulated by the British mathematician Dorothy
Wrinch.29 It is less well appreciated that another chemist and early pioneer
of hydrogen bond research, Maurice Huggins, published his thoughts on the
resonance effects in the peptide bond in 1937, and showed that this explained
the co-planarity of the amide hydrogen with the peptide unit.30
Interestingly, the concept of the planar peptide unit escaped the attention
of many scientists, and as late as 1950 Bragg et al. published a model of a
helix in which the peptides were rotated around the C-N bond.31 Pauling used
the planar peptide to predict successfully the α-helix,32 and ever since the
rigid planarity of the peptide bond has featured prominently in all
biochemistry and chemistry textbooks. In reality, high-resolution studies
show that the bond is relatively flexible, allowing for deviations of up to
20° from planarity.18 This is visible particularly well in the synPDZ2
structure, which shows several significantly twisted peptides with very well
resolved electron density. This is of significance for the proper application
of restraints in crystallographic refinement.

With respect to C-H···O bonds, it is important to realize that chemists have
long accepted their existence. As with the peptide bond, the original
observation is traced to Pauling, who attributed the high boiling temperature
of acetyl chloride to a C-H···O bond.33 The notion that these bonds are
ubiquitous in proteins was put forward by Derewenda et al.,34 and in recent
years has gained significant support.35 Stereochemical arguments show that
C-H···O bonds may play a particularly significant role in β-sheets, by
providing a stabilizing interaction mediated by Cα protons and the free
orbitals on the opposing carbonyl oxygen atoms.22 These interactions,
saturating the H-bonding potential of the internal carbonyl group, account
for the majority of the so-called lost hydrogen bonds (LHBs) in the core of
protein molecules. In spite of the increased acceptance of the notion of weak
H-bonds mediated by C-H donor groups, doubts have occasionally been raised as
to whether the favorable interactions are artifacts of crystallographic
refinement. The synPDZ2 structure, refined with no restraints, provides
beautiful examples of stereochemistry consistent with C-H···O cohesive
interactions.


Materials  and  Methods 

Crystallization  and  data  collection 

The synPDZ2 domain (197-273) from human syntenin
was overexpressed as a GST-fusion form and purified as
described.12 Purified synPDZ2 contains an additional
pentapeptide (GAMDP) at the N terminus due to the
cloning procedure and is designated in the model as
residues 192-196. Crystals were obtained in 0.1 M
Hepes (pH 7.0) and 34% PEG 4000 using the sitting-drop,
vapor-diffusion method with microseeding and
were frozen in liquid nitrogen. Data were collected at
beamline X8C at NSLS with a wavelength of 0.85 Å
using an ADSC Quantum 4R CCD. Two data sets were
collected. For high resolution, the detector was set at a
distance of 45 mm and the crystal was exposed for
40 seconds with 0.5° of oscillation. For low resolution,
the exposure time was ten seconds with 1.0° of
oscillation at a detector distance of 90 mm.

Data sets were processed and scaled using HKL2000.37
The space group is P2₁ with unit cell parameters of
a = 25.88 Å, b = 39.54 Å, c = 32.28 Å, β = 109.64°. The
completeness of data in the outer shell decreased
dramatically due to the shape of the detector. At the
resolution range of 20 Å to 0.73 Å, the total number of
observations was 316,159 and the number of unique
reflections was 71,347, yielding an overall completeness of
83.9%.
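
The data statistics quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original processing pipeline: it computes the monoclinic cell volume (V = a·b·c·sin β), the average multiplicity of the measurements, and the number of unique reflections implied by the stated completeness. The function and variable names are illustrative assumptions.

```python
import math

# Values quoted in the text above.
a, b, c = 25.88, 39.54, 32.28   # unit cell edges, in angstroms
beta_deg = 109.64               # monoclinic angle, in degrees
n_observations = 316_159
n_unique = 71_347
completeness = 0.839


def monoclinic_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def redundancy(n_obs, n_uniq):
    """Average number of measurements per unique reflection."""
    return n_obs / n_uniq


volume = monoclinic_volume(a, b, c, beta_deg)
mult = redundancy(n_observations, n_unique)
# Unique reflections that would correspond to 100% completeness.
n_possible = n_unique / completeness

print(f"cell volume (A^3): {volume:.0f}")
print(f"redundancy       : {mult:.2f}")
print(f"possible unique  : {n_possible:.0f}")
```

With the numbers above, each unique reflection was measured a little more than four times on average.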

Refinement 

The structure was solved by the molecular replacement
method using AMoRe38 with 1NTE as the starting atomic
model and was first refined with a maximum
likelihood target function using a subset of data to 1.0 Å
resolution with REFMAC539 from the CCP4 suite.40
Manual model building was performed in O.41 Further
refinement with all data to 0.73 Å resolution was carried
out using SHELX-9742 with standard conjugate gradient
refinement (CGLS). Crystallographic details including
data and refinement statistics are shown in Table 2.
Structural analysis was carried out with the programs
PROCHECK,15 EdPDB,43 PARVATI20 and NCI.25

Protein  Data  Bank  accession  number 

The coordinates and the structure factors have been
deposited with the RCSB Protein Data Bank under
accession number 1R6J.

Summary

Full  understanding  of  the  mechanisms  of  function  of  multidomain  proteins  requires 
knowledge  of  their  supramodular  architecture  in  solution.  This  is  a  non-trivial  task  for 
either X-ray crystallography or NMR, because intrinsic flexibility makes crystallization
of these proteins difficult, while their size makes NMR impractical. We
here  describe  synergistic  application  of  data  derived  from  X-ray  crystallography  and 
residual  dipolar  couplings  (RDCs),  to  address  the  question  of  the  supramodular  structure 
of  syntenin,  a  32  kDa  protein  containing  two  PDZ  domains  and  involved  in  cytoskeleton- 
membrane  organization.  We  show  that  the  mutual  disposition  of  the  PDZ  domains  clearly 
differs  from  that  seen  in  the  crystal  structure  and  we  provide  evidence  that  N-  and  C- 
terminal  fragments  of  syntenin,  hitherto  presumed  to  lack  ordered  structure,  contain 
folded  structural  elements  in  the  full-length  protein  in  contact  with  the  PDZ  tandem. 


Introduction 

A  large  fraction  of  genes  in  eukaryotic  genomes  codes  for  large,  multidomain  proteins, 
many  of  which  are  critically  responsible  for  complex  pathways  of  cell  regulation.  While 
X-ray  crystallography  and  NMR  have  been  successful  at  structural  characterization  of 
isolated  domains  and  their  binary  complexes,  the  task  of  characterization  of  solution 
structures  of  full-length,  multidomain  proteins,  is  an  immense  challenge  to  the  two 
techniques. The intrinsic flexibility of most multidomain proteins makes it very difficult
to grow diffracting single crystals, unless they can be induced to form a 'closed', compact
conformation.  There  is  also  the  danger  that  crystal  packing  forces  may  distort  the 
structure  of  a  protein  with  a  high  degree  of  intrinsic  flexibility,  or  stabilize  a  conformer 
which  is  not  representative  of  the  structure  in  solution.  On  the  other  hand,  while  not 
dependent  on  crystalline  samples,  NMR  methodology  becomes  very  labor-intensive  and 
expensive  with  increasing  molecular  weight  of  the  protein.  Clearly,  a  better 
understanding  of  the  behavior  of  multidomain  proteins  in  solution  requires  development 
of  alternative  approaches,  including  a  synergistic  use  of  X-ray  diffraction  models  and 
NMR-derived  data.  Such  an  approach  has  been  made  possible  with  the  introduction  of 
residual  dipolar  couplings  (RDCs)  as  an  alternative  and  complementary  approach  greatly 
expanding  the  scope  of  NMR  methodology  (Prestegard,  et  al.,  2000,  Bax,  et  al.,  2001,  de 
Alba  and  Tjandra,  2002,  Bax,  2003).  The  RDCs  make  it  possible  to  determine  the 
orientation of specific bonds, such as 1H-15N peptide moieties, for partially oriented
molecules.  While  the  determination  of  RDCs  still  requires  full  assignment  of  chemical 
shifts,  the  spectra  can  be  analyzed  in  a  significantly  shorter  time  than  standard  NOE 
based  experiments.  Furthermore,  unlike  the  chemical  shifts  which  cannot  be  predicted 
with  very  high  accuracy,  RDCs  can  easily  be  computed  based  on  a  set  of  crystallographic 
coordinates,  making  it  easy  to  combine  crystallographic  and  NMR  data  to  generate  a 
comprehensive  description  of  structural  properties  of  a  given  protein  in  solution. 
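
To illustrate why RDCs are straightforward to compute from coordinates, the sketch below evaluates the standard expression D = Da[(3cos²θ − 1) + (3/2)R sin²θ cos 2φ] for a bond vector expressed in the principal frame of the alignment tensor. The function name and the values of the axial component (Da) and rhombicity (R) are illustrative assumptions, not values determined in this study.

```python
import math


def rdc(bond_vector, da, rhombicity):
    """Residual dipolar coupling (Hz) for a bond vector given in the
    principal axis frame of the alignment tensor.

    da         -- axial component of the alignment tensor (Hz)
    rhombicity -- rhombic-to-axial ratio R of the tensor
    """
    x, y, z = bond_vector
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    # For a unit vector: cos^2(theta) = z^2 and
    # sin^2(theta) * cos(2*phi) = x^2 - y^2.
    return da * ((3.0 * z * z - 1.0) + 1.5 * rhombicity * (x * x - y * y))


# Hypothetical tensor parameters, chosen only for illustration.
DA_HZ, RHOMBICITY = 10.0, 0.3
for name, vec in [("along z", (0, 0, 1)), ("along x", (1, 0, 0)), ("along y", (0, 1, 0))]:
    print(f"N-H vector {name}: {rdc(vec, DA_HZ, RHOMBICITY):+.1f} Hz")
```

In practice, the N-H vectors would be taken from the crystallographic model after hydrogen atoms are added, and the tensor parameters would be fitted to the measured couplings.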

In  this  report  we  describe  the  use  of  this  novel  approach  to  characterize  the 
solution  supramodular  architecture  of  syntenin,  a  32  kDa  scaffolding  protein  which 
contains  two  PDZ  domains  arranged  in  a  tandem,  and  N-  and  C-terminal  fragments, 
hitherto presumed to be unstructured based on secondary structure prediction. PDZ
domains, between 80 and 90 amino acids in size, are the most ubiquitous protein-protein
interaction domains, and are found exclusively in cytoplasmic, multidomain
proteins, often in conjunction with one or more domains of other types (Nourry, et al.,
2003). They typically function by selective binding of C-terminal oligopeptides of other
proteins, notably channels and receptors (Doyle, et al., 1996, Hillier, et al., 1999), and
thus play a critical role in the organization of signaling complexes (Fan and Zhang,
2002). Syntenin was originally identified as a binding partner of the cytoplasmic domain
of  vertebrate  syndecans  (Grootjans,  et  al.,  1997).  It  is  widely  expressed,  strongly 
associated  with  membranes,  and  plays  a  role  in  cytoskeleton-membrane  organization 
(Zimmermann,  et  al.,  2001),  protein  trafficking,  cell  adhesion,  and  activation  of  the 
transcription  factor  Sox4  (Geijsen,  et  al.,  2001),  through  interactions  with  a  number  of 
other  proteins  including  ephrins  (Lin,  et  al.,  1999)  and  neurexins  (Lin,  et  al.,  1999, 
Grootjans,  et  al.,  2000).  The  PDZ  tandem  of  syntenin  and  the  isolated  second  (PDZ2) 
domain have been studied by X-ray crystallography (Kang, et al., 2003, Kang, et al.,
2003, Kang, et al., 2004). These studies revealed that the PDZ tandem forms a head-to-tail
homodimer, suggesting that dimerization may be a biologically relevant phenomenon,
in agreement with other biochemical data (Koroll, et al., 2001). The crystal structure
showed  a  rigid  supramodular  architecture,  with  a  possibility  of  domain-swapping  within 
the  homodimer  (Figure  1). 

Since  the  relative  disposition  of  PDZ  domains  within  multi-PDZ  proteins  is 
increasingly  recognized  as  having  an  important  biological  role,  our  study  was  intended  to 
answer  the  following  unresolved  issues:  (a)  is  full  length  syntenin  dimeric  or  monomeric 
in solution? (b) can we detect domain swapping events in solution? (c) do the domains
retain in solution the supramodular architecture observed in the crystal? (d) do the N- and
C-termini,  presumed  to  be  unstructured,  contain  any  folded  elements  interacting  with  the 
PDZ  tandem? 

By  combining  the  NMR  derived  information,  based  on  the  RDC  measurements, 
with  the  known  crystal  structures,  we  show  that  in  solution  syntenin  is  monomeric  but  its 
supramodular  structure  is  significantly  different  from  that  seen  in  the  crystal.  We  also 
present  evidence  that  the  C-terminus  affects  the  structure  of  the  tandem  while  the  N- 
terminal  fragment  is  unstructured.  This  investigation  provides  a  powerful  illustration  of 
the  complementary  role  of  crystallography  and  solution  NMR,  and  shows  how  the 
combination  of  these  techniques  allows  one  to  overcome  the  limitations  of  each  of  these 
methods  when  applied  in  isolation. 


Results 

Chemical  shift  analysis  of  syntenin  PDZ  domains 

As  the  first  step  in  the  study  of  the  supramodular  structure  of  syntenin  in  solution,  we 
needed  to  compare  chemical  shifts  of  the  two  isolated  PDZ  domains  and  that  of  the 
tandem. Each of the two PDZ domains, PDZ1 and PDZ2, was individually expressed
and purified (see Materials and Methods). As a prerequisite for the RDC analysis, it was
necessary to obtain reasonably complete chemical shift assignments for amides, using
samples of 15N labeled proteins. This was accomplished based on the analysis of 15N-edited
HSQC-TOCSY, 15N-edited HSQC-NOESY and HNHA experiments. The NMR
1H-15N HSQC spectra of both isolated PDZ1 and PDZ2 domains are well dispersed.
Assignments of 1HN, 15N and 1Hα chemical shifts were 100 % and 90 % complete for
PDZ1 and PDZ2, respectively. Interestingly, the chemical shift assignment of the PDZ2
domain revealed that the backbone amide of Asn215 is strongly downfield-shifted
(δHN = 12.66 ppm). Since there are no aromatic rings in the proximity of this amide, the
unusually large chemical shift (one of the largest amide chemical shifts reported in the
literature) must result from a very short hydrogen bond with the side chain of Asp251.
This is further corroborated by a large temperature coefficient for the amide of Asn215
(ΔδHN/ΔT = -5.8 ppb/K) (Cierpicki, et al., 2002). The crystal structure (PDB code 1N99)
shows that the distance between the backbone nitrogen of Asn215 and the side chain
oxygen of Asp251 is 2.56 Å.

In order to compare the isolated PDZ domains to the PDZ tandem in solution, we
also measured the 1H-15N HSQC spectra for the tandem. The comparisons of the spectra
indicate that most of the amide chemical shifts do not vary significantly between the
isolated domains and the tandem. Thus, the knowledge of the chemical shifts for the
isolated domains was of significant help in the assignment of the PDZ tandem, and
consequently 96 % of the 1HN, 15N and 1Hα resonances, including those corresponding to
the linker region between the PDZ domains, were assigned.

The observed differences in 1HN and 15N chemical shifts between isolated
domains and the tandem are summarized in Figure 1. Interestingly, much stronger
changes are observed for PDZ2 than for PDZ1, and the largest differences exist for
PDZ2 residues 213-218 and 234-241. The crystal structure of the PDZ tandem shows that
three aromatic rings (Phe154, Phe195 and Phe273) are buried at the interface between the
PDZ domains and thus ring current effects may be the source of some of the observed
chemical shift changes. However, in spite of these localized differences, there are no
substantial changes for residues within close proximity to the peptide binding sites.

The  PDZ  tandem  in  solution 

The crystal structure of the PDZ tandem is consistent with a homodimeric structure, arranged
head-to-tail, so that PDZ1 in one monomer makes extensive contacts with PDZ2 of the
other monomer in the asymmetric unit (Kang, et al., 2003). To probe whether the dimeric
or the monomeric state is predominant in solution, we measured 15N
relaxation times. The estimated correlation times, inversely proportional to the tumbling
rate of the molecules in solution, determined for PDZ1, PDZ2 and for the tandem from
15N relaxation time ratios (T1/T2) are 4.7±0.2, 4.5±0.5 and 10.0±0.5 ns, respectively. The
predicted values, assuming an axially symmetric model for the tandem, are 13 and 26 ns
for a monomeric and dimeric species, respectively (see Materials and Methods). Thus,
the experimental NMR data for the tandem are consistent with a predominantly monomeric
state of the protein in solution.
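
As a rough plausibility check on this reasoning, the tumbling time of a monomer and of a dimer can be estimated with the Stokes-Einstein-Debye relation for a hydrated sphere, τc = 4πηr³/(3kT). This is a cruder model than the axially symmetric one used for the predicted values quoted above, so the numbers are not expected to match them. The temperature (298 K), the viscosity of water, the partial specific volume (0.73 cm³/g) and the hydration shell thickness (3.2 Å) are all assumptions made for this sketch.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol


def sphere_tau_c(mass_da, temp_k=298.0, viscosity=0.89e-3,
                 partial_specific_volume=0.73, hydration_shell=3.2e-10):
    """Stokes-Einstein-Debye correlation time (s) of a hydrated sphere.

    mass_da                 -- molecular mass in Da (g/mol)
    viscosity               -- solvent viscosity in Pa*s (assumed: water, 298 K)
    partial_specific_volume -- in cm^3/g (assumed typical protein value)
    hydration_shell         -- added to the dry radius, in m (assumed)
    """
    # Dry volume per molecule, converted from cm^3 to m^3.
    volume = mass_da * partial_specific_volume / N_A * 1e-6
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0) + hydration_shell
    return 4.0 * math.pi * viscosity * radius ** 3 / (3.0 * K_B * temp_k)


tau_monomer = sphere_tau_c(19_000)   # 19 kDa PDZ tandem
tau_dimer = sphere_tau_c(38_000)     # hypothetical homodimer
print(f"monomer: {tau_monomer * 1e9:.1f} ns, dimer: {tau_dimer * 1e9:.1f} ns")
```

Under these assumptions the sphere model gives about 8 ns for the monomer and about 15 ns for the dimer, so the measured 10 ns lies much closer to the monomer estimate.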

In  order  to  gain  qualitative  insight  into  the  backbone  mobility  within  the  tandem, 
we compared the 15N{1H} NOEs for each isolated domain with those measured for the
tandem. A relatively uniform distribution of NOEs was observed with average values of
0.77±0.05, 0.76±0.03 and 0.75±0.08 for PDZ1, PDZ2 and PDZ12, respectively.
Interestingly, the NOE values for the residues within the linker region between PDZ1 and
PDZ2 (i.e. Phe195, Glu196 and Arg197) are not diminished. This strongly suggested that
the two PDZ domains have fixed relative orientations within the tandem.

The relative orientation of the PDZ domains in the tandem

Next, we determined the mutual dispositions of the two PDZ domains in the tandem,
based on the RDCs. To measure the latter, several samples with different gel
compositions were used to find optimal conditions for inducing the weak alignment. We
observed an alignment of the tandem in compressed copolymer polyacrylamide gels (see
Materials and Methods). Two sets of 1DHN values were measured, and the experimental
values of RDCs ranged from -14 to +15 Hz and -11.5 to +14.5 Hz for 50+M and 75+M,
respectively. However, the correlation coefficient between the two data sets (r) is 0.97
and the alignments are virtually identical. Qualitative compatibility between dipolar
couplings and the corresponding crystal structures was evaluated using the quality factor
Q (Cornilescu, et al., 1998). A value of zero indicates a perfect fit, while in practice a value
of approximately 20% corresponds to a very good fit between the two sets of observations
(see Materials and Methods) (Ottiger and Bax, 1999). In order to select structures of the
component PDZ domains that most accurately represent the structure in solution we
compared the quality factors (Q). Values of Q calculated for individual domains from
PDZ12 using the 75+M data set are 22.3, 22.1, 28.5 and 22.5 % for PDZ1A, PDZ2A,
PDZ1B and PDZ2B, respectively (superscripts denote monomers A and B from the
tandem crystal structure). Thus, it is most likely that monomer A constitutes the best
representation of the structure in solution. Parameters of the alignment tensors were
obtained from the best fit of measured and calculated RDCs based on the crystal
structures of PDZ1A and PDZ2A domains (see Table I). Interestingly, the alignment of the
individual domains is very similar, although the orientations of PDZ1 and PDZ2 are
noticeably different. The quality factor calculated for the crystal structure of the tandem
(PDZ12A) is 27.7 %, significantly higher than that of the individual domains, indicating a
possibility of altered interdomain orientation in solution.
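
The quality factor used in this comparison is the ratio of the root-mean-square deviation between observed and calculated couplings to the root-mean-square of the observed couplings. A minimal sketch of that calculation follows; the function name and the example couplings are hypothetical and serve only to show the arithmetic.

```python
import math


def q_factor(observed, calculated):
    """Quality factor Q = rms(D_obs - D_calc) / rms(D_obs).

    Returns a fraction; multiply by 100 to express Q in percent.
    """
    if len(observed) != len(calculated) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    sq_dev = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    sq_obs = sum(o ** 2 for o in observed)
    # The 1/N factors of the two rms values cancel in the ratio.
    return math.sqrt(sq_dev / sq_obs)


# Hypothetical couplings (Hz), for illustration only.
d_obs = [12.1, -8.4, 3.3, -13.0, 6.7]
d_calc = [10.5, -9.9, 4.1, -11.2, 8.0]
print(f"Q = {100 * q_factor(d_obs, d_calc):.1f} %")
```

Because Q is normalized by the spread of the observed couplings, values obtained for different domains or data sets can be compared directly, as is done above for the four crystallographic copies of the domains.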

We  then  used  the  RDCs  for  each  individual  PDZ  domain  to  determine  the  relative 
orientations  of  these  modules  in  the  PDZ  tandem  in  solution.  We  first  aligned  the 
coordinates  of  the  PDZ  tandem  to  the  alignment  tensor  frame  of  PDZ1,  rotating  the 
crystal structure by -26°, 18° and -42° about x, y and z axes, respectively. Under those
circumstances, the alignment tensor of PDZ2 is rotated with respect to that of PDZ1 by
-5°, 3° and -23° around the x, y and z axes, respectively (Table I). Application of the above
transformation  to  PDZ2  yielded  the  architecture  of  PDZ  tandem  that  is  consistent  with 
the  dipolar  couplings  recorded  in  solution.  The  resulting  quality  factor  calculated  for  the 
tandem  upon  reorientation  of  the  PDZ  domains  dropped  from  27.7  to  22.7  %  and  is 
similar  to  that  calculated  for  the  individual  domains.  The  magnitude  of  the  angular  errors 
in  domain  orientation  calculated  from  the  residual  dipolar  couplings  has  been  evaluated 
using  the  jack-knife  procedure  (Mosteller  and  Tukey,  1977)  and  through  a  comparison  of 
two  independent  sets  of  dipolar  couplings.  The  two  data  sets  are  very  similar  and  the 
differences do not exceed 1.5° for rotation about each of the axes.
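
The three rotation angles can be combined into a single rotation matrix, from which the net rotation of PDZ2 follows. The sketch below assumes that the rotations are applied about fixed x, y and z axes in that order (R = Rz·Ry·Rx acting on column vectors); the text does not state the convention, so this ordering is an assumption, although for small angles such as these the net rotation depends only weakly on it.

```python
import math


def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compose_xyz(ax, ay, az):
    """Rotation about x, then y, then z (fixed axes): R = Rz * Ry * Rx."""
    return matmul(rot_z(az), matmul(rot_y(ay), rot_x(ax)))


def net_angle(r):
    """Net rotation angle (degrees) from the trace of a rotation matrix."""
    trace = r[0][0] + r[1][1] + r[2][2]
    return math.degrees(math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0))))


rotation = compose_xyz(-5.0, 3.0, -23.0)   # angles quoted in the text
print(f"net rotation of PDZ2: {net_angle(rotation):.1f} degrees")
```

The net rotation is dominated by the component about the z axis, as expected from the quoted angles.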

Since  the  analysis  of  the  RDCs  from  a  single  alignment  gives  rise  to  a  fourfold 
degeneracy  in  the  fragment  orientations,  we  had  to  consider  the  possibility  of  an  alternate 
interdomain arrangement. Thus, we constructed a domain-swapped tandem represented by
PDZ1A and PDZ2B and we assessed the compatibility of this model with the
experimentally  determined  RDCs.  The  agreement  was  poor,  as  judged  by  the  high  value 
of  the  quality  factor  (72.8  %).  Furthermore,  the  domain  swapped  structure  is  less 
consistent  with  the  chemical  shift  data  (Figure  1). 
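
The fourfold degeneracy mentioned above arises because an RDC depends only on the squares of the bond vector components in the tensor frame, so a 180° rotation of a fragment about any principal axis of the alignment tensor leaves every coupling unchanged. The sketch below demonstrates this with the standard RDC expression; the tensor parameters and the bond vector are hypothetical.

```python
def rdc(vec, da, rhombicity):
    """RDC (Hz) for a unit bond vector in the alignment tensor frame."""
    x, y, z = vec
    return da * ((3.0 * z * z - 1.0) + 1.5 * rhombicity * (x * x - y * y))


def flip(vec, axis):
    """Rotate a vector by 180 degrees about the x, y or z axis."""
    x, y, z = vec
    return {"x": (x, -y, -z), "y": (-x, y, -z), "z": (-x, -y, z)}[axis]


# Hypothetical unit bond vector and tensor parameters.
bond = (0.48, -0.60, 0.64)
DA_HZ, RHOMBICITY = 10.0, 0.3

# The identity plus three 180-degree flips: four indistinguishable orientations.
orientations = [bond] + [flip(bond, axis) for axis in "xyz"]
couplings = [rdc(v, DA_HZ, RHOMBICITY) for v in orientations]
print(couplings)
```

All four orientations yield the same coupling, which is why additional information, such as a second alignment medium, chemical shift data or covalent connectivity, is needed to select among them.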

Supramodular structure of PDZ tandem

Although  the  RDCs  allow  for  an  accurate  determination  of  the  relative  orientations  of  the 
domains,  they  do  not  provide  information  regarding  the  relative  translations.  To  obtain  a 
complete  structural  description  of  the  tandem  it  is  necessary  to  obtain  additional 
information  with  respect  to  the  interdomain  interface.  Such  information  can  be  derived 
from  the  comparison  of  chemical  shifts  between  isolated  PDZ  domains  and  the  PDZ 
tandem  (Figure  1)  and  their  use  in  the  form  of  ambiguous  distance  restraints  for  structure 
calculations  (Clore  and  Schwieters,  2003). 
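
An ambiguous distance restraint does not name a specific atom pair. Instead, it is commonly evaluated through an effective distance that sums over all candidate pairs, d_eff = (Σ d_i⁻⁶)^(−1/6), so that the restraint is satisfied as soon as any one pair is close. The sketch below assumes this common r⁻⁶ summation; the exact functional form used in the calculations described here is not stated in the text, and the distances are hypothetical.

```python
def effective_distance(distances):
    """Effective distance of an ambiguous restraint using r^-6 summation.

    The result is dominated by the shortest candidate distance and is
    never larger than it.
    """
    if not distances:
        raise ValueError("at least one candidate distance is required")
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


# Hypothetical candidate distances (angstroms) from one perturbed amide in
# PDZ2 to several atoms on the surface of PDZ1.
candidates = [4.8, 9.5, 14.2, 21.0]
print(f"effective distance: {effective_distance(candidates):.2f} A")
```

Because distant candidates contribute almost nothing to the sum, such a restraint pulls the domains together without dictating which residues form the contact.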

Two  independent  simulations  employing  two  sets  of  dipolar  couplings  resulted  in 
very  similar  structures  of  the  tandem  (Figure  2A).  An  extensive  set  of  intradomain 
restraints  allowed  us  to  preserve  a  rigid  backbone  structure  during  the  simulation.  The 
root-mean-square deviations between main chain atoms of the ten lowest energy
conformers of PDZ1 and PDZ2 relative to the crystal structure are 0.10 Å and 0.17 Å,
respectively. While the relative orientation of the PDZ domains was restricted during the
simulation by the dipolar couplings, the key to obtaining a compact structure
of PDZ12 was the application of ambiguous distance restraints derived from chemical
shift analysis. The ambiguous distance restraint was not violated by more than 0.2 Å in the
final structures of PDZ12 and there are no steric clashes at the interface between the PDZ
domains. Nevertheless, we observed that in three out of ten structures calculated using
the 75+M set of RDCs, one domain is translationally shifted relative to the second
domain by 4 Å (not shown). Highly ambiguous distance restraints constructed on the
basis of chemical shift comparison do not discriminate between interdomain contact and
small intradomain structural changes. Thus, they may contain erroneous information
eventually leading to a set of incorrect structures. Such errors can be eliminated by
inspecting all sets of calculated structures. The translational movement is very unlikely to
occur since it decreases the contact area between the PDZ domains. The three incorrect
structures were accordingly eliminated from further analysis. Such structures were not
observed in calculations using the second set of RDCs (50+M). A comparison of the
tandem structures calculated using the 50+M (magenta) and 75+M (green) RDC data is
shown in Figure 2A.

A superposition of a representative solution conformer of PDZ12 (blue) and the
crystal structure of PDZ12A (green; monomer A from the crystal structure) is shown in
Figure 2B. The relative rotation of PDZ2 is clearly visible and the root-mean-square deviation
between main chain atoms of the two PDZ2 domains is 3.9 Å. As a consequence, a
number of backbone atoms are translationally shifted by more than 5 Å relative to the
crystal structure.
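
The deviation quoted for PDZ2 is a coordinate RMSD evaluated over matched main chain atoms once the two models are placed in a common frame. The sketch below computes such an RMSD without performing any further fitting, and also reports the largest single-atom displacement. It assumes that the models have already been superposed (for example on PDZ1), and the coordinates are hypothetical.

```python
import math


def displacements(coords_a, coords_b):
    """Per-atom distances between two matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally long coordinate lists")
    return [math.dist(p, q) for p, q in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between matched atoms (no fitting)."""
    d = displacements(coords_a, coords_b)
    return math.sqrt(sum(x * x for x in d) / len(d))


# Hypothetical main chain coordinates (angstroms) of the same atoms in the
# crystal model and in a solution conformer, already in a common frame.
crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
solution = [(0.0, 1.0, 0.0), (3.8, 3.0, 0.0), (7.6, 5.5, 0.0)]
print(f"RMSD: {rmsd(crystal, solution):.2f} A, "
      f"largest shift: {max(displacements(crystal, solution)):.2f} A")
```

As in the comparison above, atoms far from the rotation axis move by considerably more than the RMSD, which is an average over the whole domain.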

In  order  to  examine  whether  isolated  PDZ  domains  are  able  to  interact  in  solution 
we  performed  an  NMR  titration  experiment  in  which  unlabelled  PDZ2  was  added  to  15N 
labeled PDZ1 in a 1:1 molar ratio (data not shown). Since no chemical shifts of PDZ1
were  affected,  we  concluded  that  the  covalent  linker  between  the  domains  is  necessary  to 
maintain  the  contact  between  the  PDZ  domains. 

The  N-  and  C-termini 

Structural  analysis  of  modular  proteins  is  often  limited  to  domains  with  well  defined 
tertiary structures. While the protein-protein interactions are frequently mediated by these
globular domains, the presence of full-length intact proteins including unfolded
fragments  is  necessary  for  the  expression  of  full  biological  function.  Detailed  study  of 
such  unstructured  fragments  is  difficult.  In  the  case  of  syntenin  over  45  %  of  the 
polypeptide  chain  extends  beyond  the  domain  boundaries.  In  order  to  address  the 
question  of  whether  the  terminal  fragments  of  the  protein  form  any  contacts  with  PDZ1 
and  PDZ2  domains,  possibly  modulating  their  structure  and  interactions,  we  prepared 
15N-labeled full-length syntenin (FL). Despite the large size of the FL protein, the 1H-15N
HSQC spectrum shows relatively narrow signals corresponding to the PDZ12 fragment.
Additionally, the severely overlapped region in the center of the spectrum is
characteristic of the unstructured N- and C-termini (Figure 3). Interestingly, the detailed
comparison  of  full  length  protein  and  PDZ12  spectra  indicates  distinct  differences 
including  an  increase  in  the  number  of  signals  in  the  folded  region  (see  below). 

In order to evaluate the effects of the N- and C-termini, we analyzed two
additional syntenin constructs: N-PDZ12 (residues 1-273) and PDZ12-C (residues 113-298).
A comparison of the 1H-15N HSQC spectra for all syntenin fragments measured under
identical conditions is shown in Figure 3. The spectrum of N-PDZ12 is very similar to that of the
PDZ tandem and contains a number of additional resonances arising from the N-terminus
(as seen for FL in Figure 4). All these extra signals are found to lie within random-coil
positions, indicating that this fragment is unstructured. However, addition of the N-terminal
extension causes numerous shifts for resonances within the PDZ12 fragment
(Figure 4A). In order to unambiguously assign these changes we prepared an additional
sample by mixing equimolar amounts of 15N-labelled N-PDZ12 and PDZ12. In such an
experiment, two peaks with diminished intensities should be observed for all amides that
differ in chemical shifts between PDZ12 and N-PDZ12. We analyzed all signals that
were  not  overlapped  with  the  N-terminus  and  found  that  all  amides  with  affected 
chemical  shifts  belong  to  PDZ1  and  lie  in  close  proximity  to  the  N-terminus.  Thus,  in 
spite  of  the  fact  that  numerous  amides  have  been  affected  it  is  most  likely  that  all 
perturbations  of  chemical  shifts  are  caused  by  the  attachment  of  the  N-terminal 
polypeptide  that  is  most  likely  unstructured  and  does  not  interfere  with  the  PDZ  tandem. 

Unlike that of N-PDZ12, the spectrum of PDZ12-C shows numerous strongly
broadened resonances (Figure 4B). Furthermore, the sample of PDZ12-C is less stable in
solution and tends to precipitate even at low concentration. The observed signal
broadening may indicate an increased aggregation propensity of the protein upon addition
of the 25 C-terminal residues to the tandem. The spectrum of PDZ12-C indicates
numerous chemical shift changes relative to the PDZ tandem alone (Figure 4B); however,
due to strong signal broadening a reliable assignment of the affected residues is not
possible. Interestingly, full length syntenin is more stable than PDZ12-C and a high
resolution spectrum was measured (Figure 3). A comparison of the 1H-15N HSQC spectra
of the full length protein and PDZ12 reveals broader signals (Figure 4C), but this effect
can be attributed to the larger molecular weight of full length syntenin (32 kDa relative to
19 kDa for PDZ12). The most interesting feature of the FL spectrum is that nearly all
resonances are shifted relative to PDZ12. Furthermore, we could identify additional
signals indicating that most likely several residues outside the PDZ domains are
structured as well. For example, the spectrum of the full-length protein reveals the
presence of an upfield-shifted amide with a proton chemical shift of 6.34 ppm (Figure 3).
The comparison of the different fragments and the full length protein shows that the
presence of both termini is necessary for maintaining the proper structure of syntenin. It
is likely that residues from the C-terminal fragment of syntenin as well as possibly the N-terminus
are structured and interact with the PDZ domains.

Discussion 

Regulatory and scaffolding proteins often contain multiple PDZ domains, frequently
closely spaced along the sequence (Jelen, et al., 2003, Nourry, et al., 2003). Such
architecture is important for their function, as has been demonstrated, for example, for
the first two PDZ domains of PSD-95, a protein mediating ion channel clustering
(Imamura, et al., 2002). At least some multi-PDZ proteins have defined supramodular
architectures and their interactions with binding partners display a higher level of
complexity (Feng, et al., 2003, Im, et al., 2003, Long, et al., 2003). Furthermore,
sequence analyses indicate that PDZ domains are frequently connected by short linkers of
conserved length, which may impose constraints on the mutual disposition of the adjacent
domains. The structure of the PSD-95 PDZ domain tandem revealed restricted orientation
of PDZ domains leading to concerted orientation of peptide binding sites (Long, et al.,
2003). Similarly, the solution structure of the tandem of PDZ4 and PDZ5 from the
glutamate receptor interacting protein (GRIP) revealed that PDZ domains form a compact
structure with a fixed interdomain orientation (Feng, et al., 2003). Supramodular
architecture may also result from multimerization of PDZ domains. The recently reported
structure of the Shank PDZ domain shows a tightly associated dimer that is maintained
both in solution and in the crystal (Im, et al., 2003). A dimeric association may facilitate
binding to dimeric ligands such as βPIX (Im, et al., 2003). Despite their small size and
simple structure, PDZ domains display a broad spectrum of interaction modes, the
knowledge of which is crucial for the understanding of the architecture of signaling
complexes.

In this report, we address the question of the domain orientation and supramodular
structure of syntenin's PDZ tandem, as determined using residual dipolar couplings
and chemical shifts. Our study shows that the PDZ tandem is monomeric in solution and
that the two domains tumble as a single unit with a rotational correlation time of 10 ns.
The mutual arrangement of the PDZ domains in solution has also been determined from
residual dipolar couplings. While it is similar to that seen in the crystal structure, the
domains are rotated approximately -5°, 3°, -23° about the x, y, z axes, respectively (Figure
2B). This difference leads to a rearrangement of the buried surfaces at the PDZ1/PDZ2
interface. In the crystal structure of the PDZ tandem this interface is actually much smaller
than the intermolecular contact, i.e. the interface between the PDZ1 and PDZ2 that
belong to different monomers (Kang, et al., 2003). While the intramolecular surface
area buried between the two PDZ domains within monomers A and B is 283 and 340 Å2,
respectively, the intermolecular contacts are almost twice as extensive (547 Å2 and 573
Å2 for PDZ1A-PDZ2B and PDZ1B-PDZ2A, respectively). Thus, reorientation of the
domains observed for the solution structure is most likely driven by a significant increase
of the intramolecular contact surface between PDZ1 and PDZ2, which we estimate using
two independent sets of dipolar couplings to be ~480 Å2. Clearly, the crystallization
process and crystal packing forces select a conformation which is significantly different
from the one in solution. Similar effects have been reported in other cases. For example,
the solution structure of lysozyme derived from RDCs differs from the crystal structures
and the cleft between the two domains is significantly larger in solution (Goto, et al.,
2001). Similarly, the solution state of the MBP/β-cyclodextrin complex has been found to be
~10° more closed than the crystalline state (Evenas, et al., 2001). It is clear that
synergistic application of residual dipolar couplings and crystal structures of individual
domains leads to a much more accurate determination of the architecture of modular
proteins in solution.

Our work also illustrates the structural consequences of the presence of partly
unfolded fragments in the protein. This is supported by the NMR spectra of the
full-length syntenin. While the addition of the N-terminal fragment affects only signals
within PDZ1, the presence of the C-terminal fragment leads to strong chemical shift
changes accompanied by severe signal broadening as a result of aggregation. The
full-length syntenin has a noticeably lower tendency to aggregate and consequently
high-resolution spectra can be recorded. The comparison of the 1H-15N HSQC spectra of
the full-length protein with those of the tandem reveals an increased number of amide
resonances with non-random coil chemical shifts. This indicates that there are fragments
extraneous to the PDZ domains that are structured in the full-length protein. It is likely
that this involves the C-terminal fragment, although an interaction with the N-terminus
should not be ruled out. This observation is in agreement with thermodynamic studies
previously reported for syntenin which show that the full-length protein unfolds
cooperatively and is more stable by 2 kcal/mol compared to the isolated tandem
(Kang, et al., 2003). Whether or not this effect modulates syntenin's function and/or
peptide binding requires further study.

Experimental Procedures
Theory 

Residual dipolar couplings (RDCs) arise from incomplete averaging of dipole-dipole
interactions in solution NMR and provide information about the orientation of
internuclear vectors relative to the magnetic field (Prestegard, et al., 2000, Bax, et al.,
2001, de Alba and Tjandra, 2002, Bax, 2003). The RDC between a pair of nuclei I and S is
defined by:


\[
D_{IS} = -\frac{\mu_0 h \gamma_I \gamma_S}{16\pi^3 r_{IS}^3}\left[A_a\left(3\cos^2\theta - 1\right) + \frac{3}{2}A_r\sin^2\theta\cos 2\phi\right]
\]

where μ0 is the permeability of vacuum, h is Planck's constant, γI and γS are the
gyromagnetic ratios of the interacting nuclei, rIS is the distance between I and S, θ and φ
describe the orientation of the IS internuclear vector relative to the molecular alignment
tensor, and Aa and Ar are the axial and rhombic components of the alignment tensor,
respectively. The dipolar coupling between a pair of 1H and 15N nuclei can be
simplified as follows:


\[
D_{IS} = D_a\left[\left(3\cos^2\theta - 1\right) + \frac{3}{2}R\sin^2\theta\cos 2\phi\right]
\]


where Da is the magnitude of the alignment tensor normalized to the interaction between
1HN and 15N nuclei and R is the rhombicity. Parameters of the alignment tensor (Da and
R) can be obtained from the best fit of experimental RDCs to a known structure
(Losonczi, et al., 1999). This procedure additionally yields the Euler angles (α, β, γ)
defining the rotation of molecular coordinates relative to the alignment tensor frame.
Thus, a combination of RDCs and structures determined by either NMR or
crystallographic analysis may be conveniently used to establish the relative orientation of
molecular fragments in solution (Fischer, et al., 1999).
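
As a minimal illustration of the simplified expression for the 1H-15N coupling, the following Python sketch evaluates the RDC for a given N-H bond orientation in the principal axis frame of the alignment tensor. The function name and the example values of Da and R are illustrative assumptions and are not taken from the analysis reported here.

```python
import math

def rdc_hn(da, rhombicity, theta, phi):
    """RDC (same units as da) for an N-H vector with polar angles
    theta and phi (radians) in the alignment tensor principal axis frame."""
    axial = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return da * (axial + rhombic)

if __name__ == "__main__":
    # Hypothetical tensor parameters chosen only for illustration.
    da, r = 6.0, 0.5
    for theta_deg, phi_deg in [(0, 0), (90, 0), (90, 90)]:
        d = rdc_hn(da, r, math.radians(theta_deg), math.radians(phi_deg))
        print(f"theta={theta_deg:3d} phi={phi_deg:3d}  D = {d:6.2f} Hz")
```

A vector along the z axis of the tensor gives the largest coupling (2·Da), while vectors along x and y give couplings that differ only through the rhombic term.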

Measurement of RDCs is possible only under conditions of anisotropic
tumbling of molecules in solution. So far, the most successful methods for obtaining
weak alignment of macromolecules are based on dilute liquid crystalline media (Bax
and Tjandra, 1997) and filamentous phages (Clore, et al., 1998). More recently, strained
polyacrylamide-based hydrogels were introduced as inert and very stable orienting media
(Sass, et al., 2000, Ishii, et al., 2001). Further modifications of the gel composition make
possible the preparation of charged copolymers that are stable at low concentrations and
are suitable for studying large proteins (Meier, et al., 2002, Ulmer, et al., 2003).

The  orienting  media  should  be  carefully  chosen  in  order  to  avoid  strong 
interactions  with  the  protein.  In  our  initial  studies  we  were  unsuccessful  with  the 
DMPC:DHPC  bicelles  (Bax  and  Tjandra,  1997)  or  negatively  charged  polyacrylamide 
copolymer  gels  (Meier,  et  al.,  2002),  due  to  very  strong  broadening  of  protein  signals. 
However,  we  were  able  to  utilize  the  recently  developed  positively  charged  copolymer 
gels  with  composition  of  50%  (3-acrylamidopropyl)-trimethylammonium  chloride  -  50% 
acrylamide  (referred  to  hereafter  as  50+M)  and  75%  (3-acrylamidopropyl)- 
trimethylammonium  chloride  -  25%  acrylamide  (75+M). 

Protein  purification 

Six different syntenin fragments have been prepared: full length protein (1-298),
N-PDZ12 (1-273), PDZ1 (113-193), PDZ2 (197-273), PDZ12-C (113-298). All proteins
were uniformly 15N-labeled by overexpression in minimal media containing (15NH4)2SO4
as the sole nitrogen source. Purification of GST-fused proteins was carried out
according to a previously published protocol (Kang, et al., 2003). All NMR samples
contained between 0.2 and 0.6 mM protein, 1 mM DTT and 150 mM NaCl in either 50
mM phosphate buffer, pH 6.5 or 50 mM TRIS buffer, pH 7.5.

NMR  spectroscopy 

NMR spectra were collected using Varian Inova 500 and 600 MHz spectrometers. NMR
experiments were typically carried out at 30 °C. Chemical shift assignment was based
on a series of 3D experiments: 15N-separated HSQC-TOCSY (Zhang, et al., 1994),
15N-separated HSQC-NOESY with 150 ms mixing time (Zhang, et al., 1994) and HNHA
(Vuister and Bax, 1993). Measurement of residual dipolar couplings was based on 2D
IPAP-type 1H-15N HSQC spectra (Ottiger, et al., 1998). Processing and analysis of
NMR spectra were carried out in the NMRPipe (Delaglio, et al., 1995) and Sparky
(Goddard, T. D. & Kneller, D. G., University of California, San Francisco) programs.

Relaxation  and  dynamics 

A standard set of experiments at 600 MHz was used to collect data for determination of
15N T1 and T2 relaxation times and the 15N{1H} NOE (Farrow, et al., 1994). Measurements
were carried out for 0.5 mM proteins in 50 mM Tris buffer, pH 6.5, 150 mM NaCl
with addition of 1 mM DTT. Two sets of measurements for PDZ1, PDZ2 and PDZ12
were carried out at 25 and 30 °C. T1 relaxation was obtained from a series of
experiments with 10, 80, 160, 240, 400, 650, 900 and 1400 ms time delays. For T2
measurements seven delays were used: 10, 30, 50, 90, 130, 170 and 230 ms. The 15N{1H}
NOE was obtained by recording one experiment with very little excitation of protons and a
second experiment at saturating power with a 3 s irradiation period (Farrow, et al., 1994).
Experimental correlation times were calculated from the R2/R1 ratio using the r2r1_tm
program (Lee, et al., 1997). Predictions of global correlation times for monomeric and
dimeric structures of the PDZ tandem were obtained using the HYDRONMR program,
assuming axially symmetric models (Garcia de la Torre, et al., 2000).
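
For a quick sanity check of a correlation time derived from relaxation data, a widely used closed-form approximation for a rigid, isotropically tumbling protein in the slow-tumbling limit is τc ≈ sqrt(6·T1/T2 − 7)/(4π·νN), where νN is the 15N resonance frequency. The Python sketch below implements only this approximation. It is not the algorithm of the r2r1_tm program, and the function name and example relaxation times are hypothetical.

```python
import math

def tau_c_from_t1_t2(t1_s, t2_s, nu_n_hz):
    """Approximate rotational correlation time (s) from 15N T1 and T2 (s).

    Assumes a rigid, isotropically tumbling molecule in the slow-tumbling
    limit; nu_n_hz is the 15N Larmor frequency in Hz.
    """
    ratio = t1_s / t2_s
    if 6.0 * ratio <= 7.0:
        raise ValueError("T1/T2 too small for this approximation")
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)

if __name__ == "__main__":
    # The 15N frequency on a 600 MHz spectrometer is about 60.8 MHz.
    # The relaxation times below are invented for illustration.
    tau = tau_c_from_t1_t2(t1_s=0.80, t2_s=0.08, nu_n_hz=60.8e6)
    print(f"tau_c ~ {tau * 1e9:.1f} ns")
```

Because the approximation neglects internal motion, chemical exchange and rotational anisotropy, it is best applied to residues in well-ordered regions and treated as a rough cross-check only.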

Alignment  in  polyacrylamide  gels 

Weak protein alignment was achieved in compressed copolymer polyacrylamide gels.
Two types of positively charged copolymers were used: 50%
(3-acrylamidopropyl)-trimethylammonium chloride - 50% acrylamide (referred to hereafter
as 50+M) and 75% (3-acrylamidopropyl)-trimethylammonium chloride - 25% acrylamide
(75+M). Vertical gel compression was achieved using a Shigemi plunger (Sass, et al.,
2000, Ishii, et al., 2001). The details of this method will be described elsewhere.

Analysis  of  RDC 

Values of 1DHN were calculated from the difference between coupling constants measured
in the absence and the presence of the gel. Determination of the alignment tensor
parameters (Da and R) and the Euler angles (α, β, γ) defining rotations of molecular
coordinates about the x, y and z axes, relative to the principal axis frame of the alignment
tensor, was carried out using the PALES program (Zweckstetter and Bax, 2000) and the
crystal structure of the PDZ tandem (PDB code 1N99). Compatibility between
experimental and calculated RDCs was evaluated based on quality factors Q calculated
from the formula: Q = rms(Dcalc - Dobs)/rms(Dobs) (Cornilescu, et al., 1998).
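
The quality factor defined above is straightforward to compute once calculated and observed couplings are available. The following Python sketch shows one way to do so; the function names and the example couplings are hypothetical and serve only to illustrate the formula.

```python
import math

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))

def q_factor(d_calc, d_obs):
    """Quality factor Q = rms(Dcalc - Dobs) / rms(Dobs)."""
    if len(d_calc) != len(d_obs) or not d_obs:
        raise ValueError("need two non-empty lists of equal length")
    residuals = [c - o for c, o in zip(d_calc, d_obs)]
    return rms(residuals) / rms(d_obs)

if __name__ == "__main__":
    # Hypothetical couplings in Hz, for illustration only.
    observed = [10.0, -5.0, 3.0, -8.0]
    calculated = [9.0, -4.0, 4.0, -7.0]
    print(f"Q = {100 * q_factor(calculated, observed):.1f} %")
```

The function returns Q as a fraction; multiplying by 100 expresses it as a percentage.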


Analysis  of  domain  orientation 

Initially, we carried out fitting of experimental 1DHN to the crystal structures of the
individual PDZ domains. For calculation of quality factors we selected PDZ12A as the
best representative of the PDZ domain structures in solution. All subsequent rotations
were performed relative to the coordinate system of the original crystal structure with the
origin located near the linker between PDZ1 and PDZ2. Euler angles determined for
PDZ1 were used to rotate the tandem structure to the principal axis frame of the PDZ1
alignment tensor. For the transformed coordinates we calculated the Euler angles for
PDZ2 defining the rotation necessary to match the alignment of PDZ1. Applying this
transformation to PDZ2 only, we obtained a tandem structure with reoriented domains
that is consistent with the RDC data. The error in the determination of the alignment
tensor and domain orientation was evaluated using a jack-knife procedure (Mosteller and
Tukey, 1977) by performing 100 cycles of calculation with random elimination of 10% of
the data.
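
The random-elimination scheme described above can be sketched generically in Python. In this sketch the estimator is a placeholder (the mean of the data), standing in for the alignment tensor fit, which in the actual analysis was performed with PALES. The function name, the default seed and the example data are assumptions made for illustration.

```python
import random
import statistics

def jackknife(data, estimator, cycles=100, drop_fraction=0.10, seed=0):
    """Repeat `estimator` on random subsets with `drop_fraction` of the data removed.

    Returns the mean and standard deviation of the estimates over all cycles.
    """
    rng = random.Random(seed)
    n_keep = len(data) - max(1, round(drop_fraction * len(data)))
    if n_keep < 1:
        raise ValueError("not enough data points")
    estimates = [estimator(rng.sample(data, n_keep)) for _ in range(cycles)]
    return statistics.mean(estimates), statistics.stdev(estimates)

if __name__ == "__main__":
    # Hypothetical data and a stand-in estimator (the mean); in practice the
    # estimator would be a tensor fit returning, e.g., one Euler angle.
    rng = random.Random(1)
    couplings = [rng.gauss(5.0, 2.0) for _ in range(60)]
    mean, sd = jackknife(couplings, statistics.mean)
    print(f"estimate = {mean:.2f} +/- {sd:.2f}")
```

The spread of the estimates over the cycles indicates how sensitive a fitted parameter is to the removal of a subset of the couplings.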

Structure  calculation 

The calculation of the syntenin tandem structure was carried out by low
temperature simulated annealing using rigid body structures of the individual domains
(Goto, et al., 2001) with addition of ambiguous interdomain distance restraints derived
from 1HN/15N chemical shift changes (Clore and Schwieters, 2003). In order to keep a
rigid backbone conformation of the PDZ domains, we applied tight distance and dihedral
angle restraints derived from the crystal structures. Separate sets of constraints for PDZ1
(residues 113-193) and PDZ2 (residues 198-270) were generated in the program Molmol
(Koradi, et al., 1996). Distance constraints were created for all pairs of backbone N, Cα
and C atoms separated by less than 6 Å and the dihedral angle constraints were composed
for the φ, ψ and ω angles. Furthermore, we included a list of residual dipolar couplings
and ambiguous distance restraints describing interdomain contacts. In order to transform
information from chemical shift analysis into highly ambiguous distance restraints, we
used a procedure similar to that proposed by Clore and Schwieters (Clore and Schwieters,
2003). The differences in chemical shifts between the isolated PDZ domains and the
tandem were calculated using the following formula: Δδ = |ΔδHN| + 0.11·|ΔδN|. Highly
ambiguous distance restraints were created between all amides from PDZ1 and PDZ2
with Δδ > 0.08 ppm. For the constructed distances we arbitrarily assigned lower and
upper bounds of 2 and 5 Å, respectively.
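
The selection of amides for the ambiguous restraints can be illustrated with a short Python sketch that applies the combined chemical shift formula and the 0.08 ppm threshold given above. The function names, the dictionary layout and the example shift differences are hypothetical.

```python
def combined_shift_change(d_hn_ppm, d_n_ppm, n_weight=0.11):
    """Combined amide shift change: |d(HN)| + n_weight * |d(N)|, in ppm."""
    return abs(d_hn_ppm) + n_weight * abs(d_n_ppm)

def perturbed_residues(shift_changes, threshold_ppm=0.08):
    """Return residue numbers whose combined shift change exceeds the threshold.

    shift_changes maps residue number -> (delta HN, delta N) in ppm.
    """
    return sorted(
        res for res, (d_hn, d_n) in shift_changes.items()
        if combined_shift_change(d_hn, d_n) > threshold_ppm
    )

if __name__ == "__main__":
    # Hypothetical shift differences (tandem minus isolated domain), in ppm.
    example = {120: (0.02, 0.10), 150: (-0.05, 0.40), 210: (0.10, -0.90)}
    print(perturbed_residues(example))
```

Residues selected in this way from PDZ1 and from PDZ2 would then be paired to form the ambiguous interdomain restraints.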

All calculations were performed using CNS (Brunger, et al., 1998) with
parameters similar to those described previously (Goto, et al., 2001). Torsion angle
dynamics at 200 K for 15 ps with a 3 fs time step was followed by a slow cooling stage in
5 K steps during a 150 ps simulation. The van der Waals scaling factor was ramped from
0.1 to 1 kcal/mol, while the force constant for dipolar couplings was ramped from 0.05 to
0.5 kcal/mol Hz². Distances and dihedral angles were restrained with force constants of
200 kcal/mol Å² and 500 kcal/mol rad², respectively. At the final stage the structures were
minimized with ten cycles of conjugate gradient minimization.

The initial model of the PDZ tandem oriented using RDCs was used to
calculate 100 structures. A family of ten structures with the best agreement between
experimental and calculated RDCs was selected for analysis. No distance violations
larger than 0.2 Å or dihedral angle violations exceeding 5° were observed in the
calculated structures. Separate calculations were performed for each set of RDCs.

Figure Legends


Figure 1. Crystal structure of the syntenin PDZ tandem dimer with residues colored
according to the extent of differences in chemical shifts between the isolated PDZ
domains and the tandem. Chemical shift differences were calculated according to the
formula Δδ = |ΔδHN| + 0.11·|ΔδN|. Red (PDZ1) and blue (PDZ2) correspond to residues
with the largest chemical shift changes (Δδ > 0.16 ppm) while yellow (PDZ1) and cyan
(PDZ2) indicate smaller differences (0.16 > Δδ > 0.08 ppm).

Figure  2.  Solution  structure  of  the  syntenin  PDZ  tandem.  A)  superposition  of  17  lowest 
energy  structures  calculated  using  two  sets  of  residual  dipolar  couplings:  50+M  (magenta, 
10  structures)  and  75+M  (green,  7  structures).  B)  difference  in  PDZ2  orientation  between 
the  crystal  structure  of  the  PDZ  tandem  (green)  and  the  structure  in  solution  (blue).  PDZ2 
is  rotated  by  -5°,  3°  and  -23°  about  x,  y  and  z  axes,  respectively,  in  the  indicated 
coordinate  system.  All  structures  were  superimposed  on  the  PDZ1  domain  (backbone 
residues  113-193). 

Figure 3. The 1H-15N HSQC spectrum of 32 kDa full-length syntenin measured at 30 °C
for 0.2 mM protein in 50 mM Tris buffer, pH 7.5 and 150 mM NaCl. The yellow field
covers the severely overlapped region containing unstructured N-terminal residues. The
upfield-shifted signal at 6.34 ppm (green) is present only in the spectrum of full-length
syntenin.

Figure 4. Comparison of 1H-15N HSQC spectra of PDZ12 (black) with N-PDZ12 (red),
PDZ12-C (green) and full length syntenin (blue). All spectra were recorded at 30 °C under
identical conditions (50 mM Tris buffer, 150 mM NaCl, pH 7.5).


Tables 


Table I. Alignment tensor parameters calculated for PDZ1, PDZ2 and PDZ12 based on
RDCs measured for the PDZ domain tandem weakly aligned in 75+M gel. Da is the
magnitude of the alignment tensor normalized to 1DHN, R is the rhombicity, Q is the
quality factor and α, β, γ are the Euler angles defining rotation about the x, y and z axes,
respectively. a) Euler angles calculated for PDZ2 in the tandem structure transformed to
the principal axis frame of PDZ1; b) parameters of the alignment tensor for PDZ12 upon
reorientation of PDZ2 in order to match the alignment frame of PDZ1 (values of Euler
angles are identical to those of PDZ1 and are not shown).


        PDZ1         PDZ2         PDZ12        PDZ2 a)      PDZ12 b)

Da      6.20         6.67         6.49         6.67         6.48
R       0.58         0.46         0.44         0.46         0.61
Q       22.3         22.1         27.7         22.1         22.7
α       -26.0±0.4    -32.0±0.6    -29.5±0.4    -4.6±0.3     -
β       18.0±0.4     14.4±0.6     17.8±0.4     3.4±0.7      -
γ       -42.2±0.7    -66.6±0.8    -55.5±0.9    -22.7±0.8    -